The problem of estimating a parameter of a quantum system through a series of measurements 
performed sequentially on a quantum probe is analyzed in the general setting where the underlying 
statistics is explicitly non-i.i.d. We present a generalization of the central limit theorem in the present 
context, which under fairly general assumptions shows that as the number N of measurement data 
increases the probability distribution of functionals of the data (e.g., the average of the data) through 
which the target parameter is estimated becomes asymptotically normal and independent of the 
initial state of the probe. At variance with the previous studies [M. Guţă, Phys. Rev. A 83, 062324 
(2011); M. van Horssen and M. Guţă, J. Math. Phys. 56, 022109 (2015)] we take a diagrammatic 
approach, which allows one to compute not only the leading orders in N of the moments of the 
average of the data but also those of the correlations among subsequent measurement outcomes. In 
particular our analysis points out that the latter, which are not available in usual i.i.d. data, can be 
exploited in order to improve the accuracy of the parameter estimation. An explicit application of 
our scheme is discussed by studying how the temperature of a thermal reservoir can be estimated 
via sequential measurements on a quantum probe in contact with the reservoir. 


I. INTRODUCTION 

Seeking the most efficient way to recover the value of a parameter $g$ encoded in the state $\rho_g$ of a quantum system is the fundamental problem of a branch of quantum information technologies [1], which goes under the name of quantum metrology [2, 3]. It goes without saying that this topic has applications in a variety of research areas, ranging from the interferometric estimation of the phase shifts induced by gravitational waves [4] and high-precision quantum magnetometry [5, 6] to remote probing of targets.

In the standard approach one typically focuses on the case where several (say $N$) identical copies of $\rho_g$ are available to experimentalists, who can hence rely on the statistical inference extracted from independent and identically distributed (i.i.d.) measurement outcomes to estimate the value of $g$. This scenario is particularly well formulated by those configurations where the unknown parameter $g$ is associated with some black-box transformation $\Lambda_g$ (say a phase shift induced in one arm of an interferometric setup) which acts on the input state $\rho_0$ of a probing system (say the light beam injected into the interferometer), yielding $\rho_g = \Lambda_g(\rho_0)$ as the output density matrix to be measured, with such a test repeated $N$ times to collect data $\{s_1, \ldots, s_N\}$. See Fig. 1(a). In this context, the ultimate limits on the attainable precision in the estimation of $g$, optimized with respect to the general detection strategy, can be computed, resulting in the so-called quantum Cramér-Rao bound, which exhibits the functional dependence upon $\rho_g$ via the quantum Fisher information. See e.g. Refs. [2, 3, 7-12].

In many situations of physical interest, however, the possibility of reinitializing the setup to the same state is not necessarily guaranteed. In the present study we are going to consider a different scheme, in which a single probing system undergoes multiple applications of $\Lambda_g$ while being monitored during the process without being reinitialized to the same input state. See Fig. 1(b). The data $\{s_1, \ldots, s_N\}$ collected by such sequential measurements will be non-i.i.d. in general. Still, we are able to estimate the target parameter $g$ from the data under certain conditions. We will see that the property of the channel describing the process is important. The idea is to let the probing system forget about the past by the mixing of the channel [13-16] (the channel being intrinsically mixing or designed to be mixing), which clusters the data and allows the central limit theorem to hold for appropriately chosen functionals of the data. This sequential scheme is suited to account for estimation procedures where one aims to recover $g$ via a sequence of weak measurements that slightly perturb the probe. In particular it can be adapted to study physical setups where the probing system is a proper subset of a many-body quantum system which is directly affected by the black-box generator $\Lambda_g$ (an explicit example of this scenario will be analyzed in the final part of this paper).

FIG. 1. (a) The standard strategy for estimating a parameter $g$ of a quantum system, where measurement data $\{s_1, \ldots, s_N\}$ are collected by independent and identical experiments. Every time the experiment is performed, the system is reset to some specific known initial state $\rho_0$. (b) The sequential scheme for estimating a parameter $g$ of a quantum system, where the measurements are performed sequentially to collect data $\{s_1, \ldots, s_N\}$ without resetting the state of the system after each measurement, and the initial state $\rho_0$ can be arbitrary.

Various schemes for quantum parameter estimation based on repetitive or continuous measurements have been studied: see e.g. [17-25]. Among them, analogous setups were analyzed in Refs. [19, 24], where the problem was formalized in terms of quantum Markov chains. Specifically, in Ref. [19] it has been shown that, under rather general assumptions, the statistics of the associated estimation problem converges asymptotically to a normal one, generalizing the similar results which were known to apply to purely classical settings [26-28]. In the present paper we first provide an independent derivation of the previous result [19] via a diagrammatic approach to compute the leading-order contributions to the moments of the associated estimating functional of the data $\{s_1, \ldots, s_N\}$, i.e., the moments of the average $S = \frac{1}{N}\sum_{i=1}^{N} s_i$. This approach allows us to prove the central limit theorem including other estimating functionals capturing the correlations among different measurement outcomes, e.g., the correlations $C_\ell$ discussed in Sec. V. The asymptotic normality of the empirical measure associated to chains of subsequent measurement outcomes is proved in Ref. [24], but in contrast to this previous work we provide explicit formulas which allow us to evaluate the elements of the covariance matrix of the normal distribution of the variables $S$ and $C_\ell$. Moreover, we point out that the inclusion of the correlations $C_\ell$ for estimation, which do not contain any useful information in the usual i.i.d. data, can help improve the accuracy of the estimation. This result, while not conclusive, is a preliminary (yet nontrivial) step towards the determination of the ultimate accuracy limit attainable in the non-i.i.d. settings.

This paper is organized as follows. In Sec. II we introduce the notation and recall some basic mathematical facts which will be used in the paper. The non-i.i.d. estimation model is then presented in Sec. III. In Sec. IV we focus on the simplest estimating functional $S$ of the measurement data, and prove its asymptotic normality under the assumption of mixing of the process. The central limit theorem is generalized to include the correlations $C_\ell$ and their role in the estimation problem is addressed in Sec. V. An explicit example is then presented in Sec. VI, where we discuss the estimation of the temperature of a thermal reservoir via local measurements on a quantum probe in contact with the reservoir. Conclusions and perspectives are summarized in Sec. VII, while some technical elements are presented in the Appendices.

II. NOTATION AND MATHEMATICAL BACKGROUND

In this section we introduce the notation and recall some basic facts on the theory of quantum channels.

A. Quantum Ergodic/Mixing Channels 

Quantum channels are completely positive and trace-preserving (CPTP) maps, transforming density operators to density operators [29-31]. Every CPTP map $\mathcal{E}$ admits at least one fixed point [13, 14], namely, a stationary state $\rho_*$,

$$\mathcal{E}(\rho_*) = \rho_*, \tag{2.1}$$

which is Hermitian, positive-semidefinite, and of unit trace. In other words, the fixed point $\rho_*$ is an eigenstate of the map $\mathcal{E}$ belonging to its unit eigenvalue 1.

If the fixed point $\rho_*$ is unique, the quantum channel $\mathcal{E}$ is called ergodic [14-16]. It implies

$$\frac{1}{N}\sum_{n=0}^{N-1}\mathcal{E}^n(\rho_0) \xrightarrow{N\to\infty} \rho_*, \quad \forall\ \text{states}\ \rho_0, \tag{2.2}$$

where $\mathcal{E}^n = \mathcal{E}\circ\cdots\circ\mathcal{E}$ denotes $n$ recursive applications of the channel $\mathcal{E}$, and the convergence is in the superoperator norm with corrections whose leading order scales as $1/N$. Moreover, if the fixed point $\rho_*$ is unique and the unit eigenvalue 1 is the only peripheral eigenvalue (eigenvalue of unit magnitude), the quantum channel $\mathcal{E}$ converges as

$$\mathcal{E}^N(\rho_0) \xrightarrow{N\to\infty} \rho_*, \quad \forall\ \text{states}\ \rho_0, \tag{2.3}$$

and is called mixing, with the convergence being as in (2.2) [13-16]. Mixing implies ergodicity, but the converse is not necessarily true.

As commented above, the fixed point $\rho_*$ of a quantum channel $\mathcal{E}$ is an eigenstate of $\mathcal{E}$ belonging to its unit eigenvalue 1. In a matrix representation of $\mathcal{E}$, it is a “right eigenvector.” The corresponding “left eigenvector” belonging to the same eigenvalue can be different from the right eigenvector in general. For a quantum channel $\mathcal{E}$, the trace Tr is a left eigenvector belonging to the unit eigenvalue 1, since the quantum channel $\mathcal{E}$ is trace-preserving,

$$\mathrm{Tr}\{\mathcal{E}(\rho)\} = \mathrm{Tr}\,\rho. \tag{2.4}$$

Let us hence write the fixed point $\rho_*$ and the trace Tr in the vectorized notation as

$$\rho_* \leftrightarrow |\rho_*), \qquad \mathrm{Tr} \leftrightarrow (1|, \tag{2.5}$$

respectively, and a couple of eigenvalue equations for the unit eigenvalue 1 read

$$\mathcal{E}|\rho_*) = |\rho_*), \qquad (1|\mathcal{E} = (1|. \tag{2.6}$$

More explicitly, given any complete set of orthonormal basis states $\{|n\rangle\}_n$ of the system, an operator $A = \sum_{n,n'} A_{nn'}|n\rangle\langle n'|$ is vectorized by $|A) = \sum_{n,n'} A_{nn'}|n\rangle\otimes|n'\rangle$ [32]. The trace $(1| = \sum_n \langle n|\otimes\langle n|$ is (the Hermitian conjugate of) the vectorized version of the identity operator: that is why it is denoted by $(1|$. In addition, the inner product $(A|B) = \mathrm{Tr}\{A^\dagger B\}$ is the Hilbert-Schmidt inner product. In this representation, the quantum channel $\mathcal{E} \leftrightarrow \sum_{m,n,m',n'} \mathcal{E}_{mn,m'n'}|m\rangle\langle m'|\otimes|n\rangle\langle n'|$ is a matrix with the matrix elements $\mathcal{E}_{mn,m'n'} = \langle m|\mathcal{E}(|m'\rangle\langle n'|)|n\rangle$ in the original representation, and the application of a map $\mathcal{E}$ is expressed by the multiplication of the corresponding matrix. By abuse of notation we use the same symbol $\mathcal{E}$ for its matrix representation.
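
As a concrete illustration of the vectorized representation, here is a minimal numerical sketch (not part of the original analysis). It assumes a qubit amplitude-damping channel with a hypothetical damping rate `gamma`, specified through Kraus operators; the helper name `superop` is introduced here for illustration only. In the row-major vectorization $|A) = \sum_{n,n'} A_{nn'}|n\rangle\otimes|n'\rangle$, the map $\rho \mapsto K\rho K^\dagger$ is represented by the matrix $K\otimes K^*$.

```python
import numpy as np

def superop(kraus_ops):
    """Matrix of rho -> sum_k K rho K^dagger acting on row-major vectorized rho."""
    return sum(np.kron(K, K.conj()) for K in kraus_ops)

gamma = 0.3  # hypothetical damping rate, chosen only for illustration
K0 = np.array([[1.0, 0.0], [0.0, np.sqrt(1.0 - gamma)]])
K1 = np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]])
E = superop([K0, K1])

# (1| : the vectorized identity operator, i.e. the trace functional
one = np.eye(2).reshape(-1)

# |rho_*) : right eigenvector for the eigenvalue closest to 1, rescaled to unit trace
evals, evecs = np.linalg.eig(E)
k = int(np.argmin(np.abs(evals - 1.0)))
rho_star = evecs[:, k].reshape(2, 2)
rho_star = rho_star / np.trace(rho_star)

# Eigenvalue moduli, largest first; mixing requires the second entry to be below 1
moduli = np.sort(np.abs(evals))[::-1]

print("(1|E = (1| :", np.allclose(one @ E, one))
print("second-largest eigenvalue modulus:", moduli[1])
```

For this particular channel the stationary state is the undamped level $|0\rangle\langle 0|$, and all other eigenvalues lie strictly inside the unit circle, so the channel is mixing in the sense of (2.3).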

In this matrix representation, the eigenvalue equation for $\mathcal{E}$ reads

$$\mathcal{E}|u_n) = \lambda_n|u_n), \qquad (v_n|\mathcal{E} = \lambda_n(v_n|. \tag{2.7}$$

In particular, $|u_0) = |\rho_*)$ and $(v_0| = (1|$ with $\lambda_0 = 1$. The eigenvectors belonging to different eigenvalues are orthogonal to each other and normalized as

$$(v_n|u_n) = 1, \qquad (v_m|u_n) = 0 \ \ \text{for}\ \ \lambda_m \neq \lambda_n. \tag{2.8}$$

Note that the matrix $\mathcal{E}$ might not be diagonalizable but is cast in the Jordan canonical form in general [13, 14].

In this paper, ergodic or mixing channels will play a central role. The unit eigenvalue 1 of such a channel $\mathcal{E}$ is not degenerate by definition, and the ergodic/mixing channel $\mathcal{E}$ can always be decomposed as

$$\mathcal{E} = \mathcal{P}_* + \mathcal{E}', \tag{2.9}$$

where

$$\mathcal{P}_* = |\rho_*)(1| = \rho_*\mathrm{Tr}\{\,\cdot\,\} \tag{2.10}$$

is the eigenprojection belonging to the nondegenerate unit eigenvalue 1 of $\mathcal{E}$, and the remaining part $\mathcal{E}'$ (which is not CPTP) is built on the eigenvectors $\{|u_n)\}_{n\neq 0}$ and $\{(v_n|\}_{n\neq 0}$ belonging to the eigenvalues $\lambda_n$ different from 1. By construction $\mathcal{E}'$ is orthogonal to $\mathcal{P}_*$, i.e., $\mathcal{P}_*\mathcal{E}' = \mathcal{E}'\mathcal{P}_* = 0$. Moreover, since $\mathcal{E}'$ does not admit a unit eigenvalue 1, the inverse $(1-\mathcal{E}')^{-1}$ exists, and we have

$$\frac{1}{N}\sum_{n=0}^{N-1}\mathcal{E}^n = \mathcal{P}_* + \frac{1}{N}\frac{1-\mathcal{E}'^N}{1-\mathcal{E}'}\mathcal{Q}_* \tag{2.11}$$

with $\mathcal{Q}_* = 1 - \mathcal{P}_*$, which allows us to prove the convergence to the fixed point $\rho_*$ in (2.2). If the channel $\mathcal{E}$ is not only ergodic but also mixing, the spectral radius of $\mathcal{E}'$ is strictly smaller than 1, and we get

$$\mathcal{E}^N = \mathcal{P}_* + \mathcal{E}'^N \xrightarrow{N\to\infty} \mathcal{P}_*, \tag{2.12}$$

which proves (2.3). If the channel $\mathcal{E}$ is ergodic but not mixing, $\mathcal{E}'$ admits a peripheral eigenvalue, and $\mathcal{E}'^N$ does not decay: we lose the convergence (2.3), but the averaged channel converges as (2.2).

Note again that $\mathcal{E}'$ might not be diagonalizable if some of the eigenvalues $\lambda_n$ of $\mathcal{E}$ are degenerate, but it is not a problem for the convergence: see [13-16].

B. Measurement and Back-Action

We recall that in quantum mechanics the most general detection scheme can be formalized in terms of a positive operator-valued measure (POVM). See e.g. Ref. [29]. Expressed in the superoperator language this amounts to assigning a collection $\mathfrak{M} = \{\mathcal{M}_s\}_s$ of trace-decreasing channels $\mathcal{M}_s$ describing the statistics of the measurement and the back-action on the probed system. In particular, given $\rho$ the density matrix of the system before the measurement, the probability of getting outcome $s$ by the measurement $\mathfrak{M}$ is given by

$$p(s|\rho) = \mathrm{Tr}\{\mathcal{M}_s(\rho)\} = (1|\mathcal{M}_s|\rho), \tag{2.13}$$

with $\mathcal{M}_s(\rho)$ being the conditional (not normalized) state immediately after the event. By construction the map

$$\mathcal{M} = \sum_s \mathcal{M}_s, \tag{2.14}$$

obtained by summing over all possible values of $s$, is CPTP and describes the evolution of the system when no record of the measurement outcome is kept. We also notice that given $\mathcal{D}$ and $\mathcal{E}$ two CPTP maps, the set of channels $\mathcal{M}'_s = \mathcal{E}\mathcal{M}_s\mathcal{D}$ defines a new POVM measurement $\mathfrak{M}' = \{\mathcal{M}'_s\}_s$, where immediately before and after the measurement $\mathfrak{M}$ one transforms the state of the probed system through the actions of $\mathcal{D}$ and $\mathcal{E}$, respectively. Finally we observe that given $\mathfrak{M} = \{\mathcal{M}_r\}_r$ and $\mathfrak{N} = \{\mathcal{N}_s\}_s$ two POVMs, the operator $(\mathcal{N}_s\circ\mathcal{M}_r)(\rho)$ represents the conditional (not normalized) state obtained when the measurements $\mathfrak{M}$ and $\mathfrak{N}$ are performed on a system in the state $\rho$ yielding measurement outcomes $r$ and $s$, respectively, the associated probability being given by $p(r,s|\rho) = (1|\mathcal{N}_s\mathcal{M}_r|\rho)$.
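
The two ingredients of this section, the outcome probabilities $p(s|\rho) = (1|\mathcal{M}_s|\rho)$ and the splitting of an ergodic channel into its eigenprojection $|\rho_*)(1|$ plus a remainder, can be checked numerically. The sketch below is only an illustration under stated assumptions: a qubit, a two-outcome instrument built from amplitude-damping Kraus operators with a hypothetical strength `gamma`, and a hypothetical unitary rotation by angle `g` applied before each measurement. The names `superop`, `Ep`, `P`, and `Q` are ours.

```python
import numpy as np

def superop(K):
    """Matrix of rho -> K rho K^dagger on row-major vectorized rho."""
    return np.kron(K, K.conj())

g, gamma = 1.0, 0.5  # hypothetical rotation angle and detection strength
U = np.array([[np.cos(g / 2), -np.sin(g / 2)],
              [np.sin(g / 2),  np.cos(g / 2)]])
K0 = np.array([[1.0, 0.0], [0.0, np.sqrt(1.0 - gamma)]])
K1 = np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]])

M = [superop(K0), superop(K1)]   # instrument {M_s}, outcomes s = 0, 1
one = np.eye(2).reshape(-1)      # (1|

# Outcome probabilities p(s|rho) = (1|M_s|rho) for the state |1><1|
rho = np.array([[0.0, 0.0], [0.0, 1.0]]).reshape(-1)
p = [float(one @ Ms @ rho) for Ms in M]

# A channel built from the instrument: rotation followed by the unread measurement
E = (M[0] + M[1]) @ superop(U)

# Fixed point, eigenprojection P = |rho_*)(1|, complement Q, remainder E' = E - P
evals, evecs = np.linalg.eig(E)
r = evecs[:, int(np.argmin(np.abs(evals - 1.0)))]
r = r / (one @ r)                # rescale to unit trace
P = np.outer(r, one)
Q = np.eye(4) - P
Ep = E - P

# Time average of the channel versus P + (1/N) (1 - E'^N) (1 - E')^{-1} Q
N = 7
lhs = sum(np.linalg.matrix_power(E, n) for n in range(N)) / N
rhs = P + (np.eye(4) - np.linalg.matrix_power(Ep, N)) @ np.linalg.inv(np.eye(4) - Ep) @ Q / N

print("p(0), p(1):", p)
print("averaged channel matches the geometric-series formula:", np.allclose(lhs, rhs))
```

The identity holds for any finite `N`, not only asymptotically, which is why a small value such as `N = 7` suffices for the check.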


III. SEQUENTIAL SCHEME 

The problem we study is the following: we wish to recover an unknown parameter $g$ of a quantum system, which is encoded in the state of a quantum probe via the action of a quantum channel $\Lambda_g$,

$$\rho_0 \mapsto \Lambda_g(\rho_0). \tag{3.1}$$

Here $\rho_0$ is the input state of the probe, which (possibly) is initialized by us, while $\Lambda_g(\rho_0)$ is the associated output state, on which we are allowed to perform measurement in order to learn about $g$. In a standard i.i.d. approach [10], one is supposed to perform the same experiment several times collecting i.i.d. outcomes $\{s_1, \ldots, s_N\}$, from which the value of $g$ is to be extrapolated via some suitable data processing. See Fig. 1(a). More precisely, in every experimental run of such an i.i.d. scheme the probe should be initialized in the same input state $\rho_0$ and the same POVM measurement $\mathfrak{M} = \{\mathcal{M}_s\}_s$ should be performed after $\Lambda_g$ has operated on the probe. On the contrary, in the protocol we are going to discuss here, while we keep performing the same measurement $\mathfrak{M}$ on the probe, the probe is not reset to $\rho_0$ after each measurement step. Instead, we just repeat the application of $\Lambda_g$ followed by a measurement many times to get a sequence of outcomes $\{s_1, \ldots, s_N\}$, whose statistics is not necessarily i.i.d. anymore. See Fig. 1(b). In this scenario, following the framework detailed in Sec. II, the state of the probe undergoes a conditional evolution described by the (not necessarily normalized) density matrix

$$\rho_0 \mapsto (\mathcal{E}_{s_N}\circ\cdots\circ\mathcal{E}_{s_1})(\rho_0), \qquad \mathcal{E}_s = \mathcal{M}_s\circ\Lambda_g, \tag{3.2}$$

whose trace

$$p(s_1, \ldots, s_N|\rho_0) = (1|\mathcal{E}_{s_N}\cdots\mathcal{E}_{s_1}|\rho_0) \tag{3.3}$$

defines the probability of the associated measurement event.

It is worth observing that this mathematical setting includes the i.i.d. scenario as a special case, where $\Lambda_g$ is identified with $\Lambda_g\circ\mathcal{P}_0$, with $\mathcal{P}_0 = |\rho_0)(1| = \rho_0\mathrm{Tr}\{\,\cdot\,\}$ being the map resetting the state of the probe into $\rho_0$. Indeed, with this choice the probability (3.3) coincides with the one for the case where the measurements $\mathfrak{M} = \{\mathcal{M}_s\}_s$ are performed independently on $N$ copies of $\Lambda_g(\rho_0)$, i.e.,

$$p(s_1, \ldots, s_N|\rho_0) = (1|\mathcal{E}_{s_N}|\rho_0)\cdots(1|\mathcal{E}_{s_2}|\rho_0)(1|\mathcal{E}_{s_1}|\rho_0) = p(s_N|\rho_0)\cdots p(s_2|\rho_0)\,p(s_1|\rho_0). \tag{3.4}$$

From (2.14) it follows that the map $\mathcal{E}$ obtained by summing $\mathcal{E}_s$ in (3.2) over all $s$,

$$\mathcal{E} = \sum_s \mathcal{E}_s = \mathcal{M}\circ\Lambda_g, \tag{3.5}$$

is CPTP, ensuring the proper normalization of the probability (3.3). As an additional constraint we will require it to be mixing (in some cases, e.g., in Sec. IV A, however, we will weaken this requirement and ask $\mathcal{E}$ to be just ergodic). This is not a strong assumption, as mixing channels actually form an open and dense set. Under this condition we will be able to prove that the parameter $g$ can be estimated from the single sequence of data $\{s_1, \ldots, s_N\}$ collected by the sequential measurements, irrespective of the initial state $\rho_0$ [19]. The rough idea is that, thanks to the mixing (2.3), repeated applications of the channel force the quantum system to forget its initial state and at the same time decorrelate the data separated beyond the correlation length, which clusters the data and allows us to define self-averaging quantities as estimating functionals, whose fluctuations diminish as $N$ increases, i.e., the central limit theorem holds.
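
The sequential probabilities are straightforward to evaluate numerically. The following sketch is an illustration under assumptions, not the authors' code: it takes a qubit probe, a hypothetical rotation by angle `g` playing the role of the black-box channel, and a two-outcome instrument with hypothetical strength `gamma`. It computes the probability of an outcome sequence by applying the conditional maps in order, checks that the probabilities of all sequences sum to one, and verifies that inserting the reset map $|\rho_0)(1|$ before every step reproduces the factorized i.i.d. probability.

```python
import itertools
import numpy as np

def superop(K):
    """Matrix of rho -> K rho K^dagger on row-major vectorized rho."""
    return np.kron(K, K.conj())

g, gamma = 1.0, 0.5  # hypothetical parameter value and measurement strength
U = np.array([[np.cos(g / 2), -np.sin(g / 2)],
              [np.sin(g / 2),  np.cos(g / 2)]])
K = [np.array([[1.0, 0.0], [0.0, np.sqrt(1.0 - gamma)]]),
     np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]])]

E_s = [superop(Ks @ U) for Ks in K]   # conditional maps: rotation, then outcome s
one = np.eye(2).reshape(-1)           # (1|
rho0 = np.array([[0.5, 0.5], [0.5, 0.5]]).reshape(-1)   # |+><+|, an arbitrary choice

def seq_prob(maps, seq, rho_vec):
    """Probability of the outcome sequence seq = (s_1, ..., s_N), s_1 first."""
    v = rho_vec
    for s in seq:
        v = maps[s] @ v
    return float(one @ v)

# Normalization over all 2^N sequences
N = 4
total = sum(seq_prob(E_s, seq, rho0)
            for seq in itertools.product(range(2), repeat=N))

# i.i.d. special case: reset to rho0 before every step
P0 = np.outer(rho0, one)              # |rho_0)(1|
E_s_iid = [Es @ P0 for Es in E_s]
seq = (1, 0, 1)
p_joint = seq_prob(E_s_iid, seq, rho0)
p_single = [seq_prob(E_s, (s,), rho0) for s in range(2)]
p_product = p_single[1] * p_single[0] * p_single[1]

print("sum over all sequences:", total)
print("i.i.d. joint vs product:", p_joint, p_product)
```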


Inferring $g$ from $\{s_1, \ldots, s_N\}$


A standard estimating functional of the measured data $\{s_1, \ldots, s_N\}$, through which one tries to infer the value of $g$, is the average

$$S = \frac{1}{N}\sum_{i=1}^{N} s_i. \tag{3.6}$$

In Ref. [19] it was noted that, under the assumption that the average channel $\mathcal{E}$ in (3.5) is mixing, the central limit theorem holds for $S$, and for large $N$ the probability distribution $P(S)$ of $S$ asymptotically becomes a Gaussian peaked at a value $\langle S\rangle_*$ with a shrinking variance $\sigma^2/N$, which are both independent of the input state $\rho_0$ of the probe, i.e.,

$$P(S) \xrightarrow{N\to\infty} \frac{1}{\sqrt{2\pi\sigma^2/N}}\exp\!\left[-\frac{(S-\langle S\rangle_*)^2}{2\sigma^2/N}\right]. \tag{3.7}$$

The explicit expressions for $\langle S\rangle_*$ and $\sigma^2$ will be provided in (4.4) and (4.12), respectively, in Sec. IV below. This ensures that the quantity $S$ evaluated from the single sequence of measurement outcomes is expected, with a high probability, to be very close to its expectation value $\langle S\rangle_*$ with a vanishingly small variance $\sigma^2/N$ for large $N$. Therefore, by comparing the observed value of $S$ with the formula for the expectation value $\langle S\rangle_*$ as a function of $g$, one can infer the parameter $g$. It is worth stressing once more that in the sequential scheme the measurement data are not independent of each other. Therefore, it is not trivial whether the central limit theorem, which is usually established for i.i.d. data, holds. The mixing, however, is strong enough to kill the correlations between two data if they are sufficiently far away from each other, and clusters the data, allowing the central limit theorem to hold.

Thanks to (3.7) the uncertainty in the estimation of $g$ through the quantity $S$ can be evaluated via the Cramér-Rao bound as [3, 7-12, 33]

$$\delta g \geq \frac{1}{\sqrt{F(g)}}, \tag{3.8}$$

where $F(g)$ is the Fisher information of the problem given by

$$F(g) = \frac{N}{\sigma^2}\left(\frac{\partial\langle S\rangle_*}{\partial g}\right)^2. \tag{3.9}$$

Accordingly, as long as $\langle S\rangle_*$ exhibits a nontrivial functional dependence upon $g$, the Fisher information $F(g)$ increases linearly in $N$, yielding an estimation error (3.8) which diminishes as $\delta g \sim 1/\sqrt{N}$ [in (3.9) we have omitted the contribution from $\partial\sigma/\partial g$ to the Fisher information $F(g)$ since it does not grow with $N$].
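
The scaling of the estimation error is easy to tabulate. The sketch below assumes the leading-order Fisher information of a Gaussian whose mean $\langle S\rangle_*$ depends on $g$ and whose variance is $\sigma^2/N$, namely $F(g) = N(\partial\langle S\rangle_*/\partial g)^2/\sigma^2$, with the $g$-dependence of $\sigma$ neglected as in the text. The function names and the numerical values of `sigma2` and `slope` are hypothetical.

```python
import math

def fisher_information(N, sigma2, dmean_dg):
    """Leading-order Fisher information for a Gaussian with mean <S>_*(g), variance sigma2/N."""
    return N * dmean_dg ** 2 / sigma2

def estimation_error_bound(N, sigma2, dmean_dg):
    """Cramer-Rao lower bound on delta g implied by the Fisher information above."""
    return 1.0 / math.sqrt(fisher_information(N, sigma2, dmean_dg))

sigma2, slope = 0.25, 0.5   # hypothetical variance and slope d<S>_*/dg
for N in (100, 400, 1600):
    print(N, estimation_error_bound(N, sigma2, slope))
```

Quadrupling the number of sequential measurements halves the bound, which is the $1/\sqrt{N}$ behavior stated above; if the slope vanishes the bound diverges and $S$ alone carries no information on $g$.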

It may happen however that the quantity $\langle S\rangle_*$ does not depend upon $g$. In such a case $F(g)$ vanishes, signaling that it is impossible to recover $g$ through $S$ [a problem which cannot be fixed by properly choosing the input state $\rho_0$ of the probe, the asymptotic distribution (3.7) being independent of $\rho_0$]. Nonetheless, even in this particular case, the sequence of data $\{s_1, \ldots, s_N\}$, which is not i.i.d. in general, can still contain some functional dependence upon $g$, which can be exploited for the estimation of $g$. In particular, the aim of the present work is to show that the correlations among the measurement data, which are absent in the usual i.i.d. data, can be used for this purpose. It turns out that, under the same mixing assumption on the channel $\mathcal{E}$ that leads to the central limit theorem for $S$ in (3.7), the correlations are also self-averaging and become asymptotically normal for large $N$, enabling one to estimate $g$ through them. See (5.21) and (5.22) in Sec. V below. Even in the case where $\langle S\rangle_*$ depends upon $g$, looking also at the correlations helps enhance the precision of the estimation of $g$, which will be demonstrated in Fig. 10 with the example studied in Sec. VI.

We first present an alternative derivation of the results of Ref. [19], i.e., the asymptotic normality of $S$, on the basis of a diagrammatic approach in Sec. IV. While our approach is more involved than the elegant perturbative approach taken in Ref. [19], it allows us to generalize the scheme to include the correlations among the measurement data in a straightforward manner to enhance the precision of the estimation. We shall indeed prove the asymptotic normality of variables including the correlation functionals in Sec. V.


IV. STATISTICAL BEHAVIOR OF S 

This section is devoted to providing an alternative derivation of the results of Ref. [19], which ultimately leads to the asymptotic normality of $S$ in (3.7). We start in Sec. IV A by proving that, under the hypothesis that the average channel $\mathcal{E}$ in (3.5) is ergodic, the quantity $S$ is self-averaging, converging to a fixed value $\langle S\rangle_*$ independent of the input state $\rho_0$. Then in Sec. IV B we introduce the mixing property and show that under this stronger condition the distribution $P(S)$, which rules the statistics of $S$, becomes asymptotically normal.


A. Law of Large Numbers by Ergodicity 

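Consider the expectation value of the quantity $S$ with respect to the probability (3.3) governing the statistical distribution of the measurement outcomes, i.e.,

$$\langle S\rangle_N = \sum_{s_1}\cdots\sum_{s_N} S\,p(s_1, \ldots, s_N|\rho_0) = \frac{1}{N}\sum_{i=1}^{N}\sum_{s_1}\cdots\sum_{s_N} s_i\,(1|\mathcal{E}_{s_N}\cdots\mathcal{E}_{s_1}|\rho_0) = \frac{1}{N}\sum_{i=1}^{N}(1|\mathcal{E}^{(1)}\mathcal{E}^{i-1}|\rho_0), \tag{4.1}$$

where $\mathcal{E}^{(1)}$ is defined by

$$\mathcal{E}^{(1)} = \sum_s s\,\mathcal{E}_s. \tag{4.2}$$

Assume here that the channel $\mathcal{E}$ is ergodic with unique fixed point $\rho_*$: using (2.11), the right-hand side of (4.1) can be written as

$$\langle S\rangle_N = \langle S\rangle_* + \frac{1}{N}(1|\mathcal{E}^{(1)}\frac{1-\mathcal{E}'^N}{1-\mathcal{E}'}\mathcal{Q}_*|\rho_0). \tag{4.3}$$

The first contribution

$$\langle S\rangle_* = (1|\mathcal{E}^{(1)}|\rho_*) = \sum_s s\,\mathrm{Tr}\{\mathcal{E}_s(\rho_*)\} = \langle s\rangle_* \tag{4.4}$$

is the value of $\langle S\rangle_N$ when the input state $\rho_0$ of the probe coincides with the fixed point $\rho_*$ of $\mathcal{E}$. As stressed by the last identity, it also coincides with the expectation value associated with the i.i.d. measurement on $\rho_*$ with the POVM $\mathfrak{M}$. The second contribution in (4.3) instead is a correction which scales at most as $1/N$ for any other choice of $\rho_0$. Accordingly in the large-$N$ limit we get

$$\langle S\rangle_N \xrightarrow{N\to\infty} \langle S\rangle_*, \tag{4.5}$$

irrespectively of $\rho_0$.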
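
The loss of memory of the initial state can be seen directly in a small numerical experiment. The sketch below is illustrative only: it assumes a qubit probe, a hypothetical rotation by angle `g` as the black-box channel, a two-outcome instrument of hypothetical strength `gamma` with outcome values $s = 0, 1$, and evaluates the expectation value of the average outcome exactly as $\frac{1}{N}\sum_{i}(1|\mathcal{E}^{(1)}\mathcal{E}^{i-1}|\rho_0)$ for two different initial states.

```python
import numpy as np

def superop(K):
    """Matrix of rho -> K rho K^dagger on row-major vectorized rho."""
    return np.kron(K, K.conj())

g, gamma = 1.0, 0.5  # hypothetical parameter value and measurement strength
U = np.array([[np.cos(g / 2), -np.sin(g / 2)],
              [np.sin(g / 2),  np.cos(g / 2)]])
K = [np.array([[1.0, 0.0], [0.0, np.sqrt(1.0 - gamma)]]),
     np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]])]
E_s = [superop(Ks @ U) for Ks in K]
E = E_s[0] + E_s[1]
one = np.eye(2).reshape(-1)                      # (1|
E1 = sum(s * Es for s, Es in enumerate(E_s))     # first-moment map, outcomes s = 0, 1

# Stationary state and stationary mean
evals, evecs = np.linalg.eig(E)
r = evecs[:, int(np.argmin(np.abs(evals - 1.0)))]
r = np.real(r / (one @ r))
S_star = float(one @ E1 @ r)

def mean_S(N, rho_vec):
    """Exact expectation of the average outcome after N sequential measurements."""
    v, total = rho_vec.copy(), 0.0
    for _ in range(N):
        total += one @ E1 @ v
        v = E @ v
    return float(total / N)

rho_a = np.array([[1.0, 0.0], [0.0, 0.0]]).reshape(-1)   # |0><0|
rho_b = np.array([[0.0, 0.0], [0.0, 1.0]]).reshape(-1)   # |1><1|
for N in (10, 100, 1000):
    print(N, mean_S(N, rho_a) - S_star, mean_S(N, rho_b) - S_star)
```

The printed deviations shrink roughly in proportion to $1/N$ for both initial states, as expected from the correction term in (4.3).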

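In a similar way we can compute the variance of $S$, obtaining

$$(\Delta S)_N^2 = \langle S^2\rangle_N - \langle S\rangle_N^2 = \frac{1}{N}(\Delta s)_*^2 + \frac{2}{N}(1|\bar{\mathcal{E}}^{(1)}\frac{1}{1-\mathcal{E}'}\bar{\mathcal{E}}^{(1)}|\rho_*) + \cdots + O(1/N^2), \tag{4.6}$$

where $\langle S^2\rangle_N$ is defined similarly to $\langle S\rangle_N$ in (4.1), and

$$\bar{\mathcal{E}}^{(n)} = \sum_s (\delta s)^n\,\mathcal{E}_s, \qquad \delta s = s - \langle s\rangle_*, \tag{4.7}$$

while

$$(\Delta s)_*^2 = \langle s^2\rangle_* - \langle s\rangle_*^2 = (1|\bar{\mathcal{E}}^{(2)}|\rho_*) \tag{4.8}$$

is the variance of $s$ in the stationary state $\rho_*$ as in (4.4). Equation (4.6) shows that the variance shrinks as $(\Delta S)_N^2 \sim 1/N$, and the fluctuation of $S$ around $\langle S\rangle_N$ becomes smaller and smaller as we proceed with the measurements. As a result, the probability of finding a single-shot value $S$ close to its expectation value $\langle S\rangle_N$ becomes very high. Indeed, Chebyshev’s inequality bounds the probability of $S$ deviating from $\langle S\rangle_N$ as

$$\mathrm{Prob}\big(|S - \langle S\rangle_N| \geq K(\Delta S)_N\big) \leq \frac{1}{K^2} \tag{4.9}$$

for any positive $K$. In this way, $S$ is self-averaging: each single $S$ is very close to its expectation value with very high probability. In addition, as shown in (4.5), $\langle S\rangle_N$ becomes independent of the initial state $\rho_0$.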
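
For a short sequence the full distribution of $S$ can be obtained by brute-force enumeration, which makes the Chebyshev bound tangible. The following is a minimal sketch under the same kind of assumptions as before (qubit probe, hypothetical rotation angle `g`, hypothetical measurement strength `gamma`, outcome values $s = 0, 1$); it is not meant as an efficient method, since the number of sequences grows as $2^N$.

```python
import itertools
import numpy as np

def superop(K):
    """Matrix of rho -> K rho K^dagger on row-major vectorized rho."""
    return np.kron(K, K.conj())

g, gamma = 1.0, 0.5  # hypothetical parameter value and measurement strength
U = np.array([[np.cos(g / 2), -np.sin(g / 2)],
              [np.sin(g / 2),  np.cos(g / 2)]])
K = [np.array([[1.0, 0.0], [0.0, np.sqrt(1.0 - gamma)]]),
     np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]])]
E_s = [superop(Ks @ U) for Ks in K]
one = np.eye(2).reshape(-1)
rho0 = np.array([[0.0, 0.0], [0.0, 1.0]]).reshape(-1)   # |1><1|, an arbitrary choice

# Enumerate all outcome sequences of length N with their probabilities
N = 10
S_vals, probs = [], []
for seq in itertools.product(range(2), repeat=N):
    v = rho0
    for s in seq:
        v = E_s[s] @ v
    S_vals.append(sum(seq) / N)
    probs.append(float(one @ v))
S_vals, probs = np.array(S_vals), np.array(probs)

# Exact mean, variance, and tail probability of the average outcome
mean = float(np.sum(probs * S_vals))
var = float(np.sum(probs * (S_vals - mean) ** 2))
K_dev = 2.0
tail = float(np.sum(probs[np.abs(S_vals - mean) >= K_dev * np.sqrt(var)]))

print("mean, variance:", mean, var)
print("tail probability vs Chebyshev bound:", tail, 1.0 / K_dev ** 2)
```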
B. Beyond the Law of Large Numbers: Central Limit Theorem by Mixing


We have so far assumed the ergodicity of the channel $\mathcal{E}$: this is necessary and sufficient for the convergence $\langle S\rangle_N \to \langle S\rangle_*$ in (4.5), and for the shrinking variance $(\Delta S)_N^2 \sim 1/N$ in (4.6). If we further assume that $\mathcal{E}$ is mixing, we can say more. For instance, the third contribution to the variance $(\Delta S)_N^2$ in (4.6) decays as $\mathcal{E}'^N/N$ [note that the sum over $j$ accumulates to $O(N)$], i.e., faster than $1/N$ (it is not guaranteed under the ergodicity, since $\mathcal{E}'^N$ does not decay), and the variance $(\Delta S)_N^2$ asymptotically becomes independent of the initial state $\rho_0$. This is because the mixing makes the system forget the initial state $\rho_0$ as in (2.3) without averaging along the time trace.

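Most importantly, if $\mathcal{E}$ is mixing, one can prove that the probability distribution of $S$ asymptotically becomes normal, converging to the Gaussian distribution (3.7). The asymptotic normality of $S$ under the mixing condition was proved in Ref. [19]. Here we derive the same result by introducing a diagrammatic approach. Specifically, in the following subsection we shall compute the moments of the variable $S - \langle S\rangle_*$, showing that for large $N$ they admit the scaling

$$\langle(S-\langle S\rangle_*)^n\rangle_N \sim O(1/N^{\lceil n/2\rceil}), \tag{4.10}$$

where $\lceil x\rceil$ denotes the smallest integer not less than $x$. In particular for even $n$ we shall see that the leading-order term is given by

$$\langle(S-\langle S\rangle_*)^n\rangle_N \simeq \frac{n!}{(n/2)!}\left(\frac{\sigma^2}{2N}\right)^{n/2} \tag{4.11}$$

with

$$\sigma^2 = (1|\bar{\mathcal{E}}^{(2)}|\rho_*) + 2(1|\bar{\mathcal{E}}^{(1)}\frac{1}{1-\mathcal{E}'}\bar{\mathcal{E}}^{(1)}|\rho_*), \tag{4.12}$$

where $\bar{\mathcal{E}}^{(1)}$ and $\bar{\mathcal{E}}^{(2)}$ are defined as in (4.7). These results allow us to conclude that the characteristic function for the scaled variable $x = \sqrt{N}(S - \langle S\rangle_*)$ becomes asymptotically normal in the limit $N \to \infty$. Indeed by direct substitution we have

$$\chi(k) = \langle e^{ikx}\rangle_N = \sum_{n=0}^{\infty}\frac{(ik\sqrt{N})^n}{n!}\langle(S-\langle S\rangle_*)^n\rangle_N \xrightarrow{N\to\infty} \sum_{r=0}^{\infty}\frac{1}{r!}\left(-\frac{k^2\sigma^2}{2}\right)^r = e^{-k^2\sigma^2/2}. \tag{4.13}$$

Accordingly the central limit theorem holds and in the limit $N \to \infty$ the probability distribution $P(x)$ of $x$ converges to a Gaussian peaked at $x = 0$ with variance $\sigma^2$, i.e., $P(x) \to e^{-x^2/2\sigma^2}/\sqrt{2\pi\sigma^2}$, which in the original variable implies (3.7).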
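
The resummation of the moments into a Gaussian characteristic function can be checked with a few lines of arithmetic. The sketch below assumes that the even moments of $S - \langle S\rangle_*$ take the Gaussian form $\frac{n!}{(n/2)!}\left(\frac{\sigma^2}{2N}\right)^{n/2}$ at leading order and that odd moments are dropped; the function names and the numerical values of `k`, `sigma2`, and `N` are hypothetical.

```python
import math

def gaussian_moment(n, sigma2, N):
    """Assumed leading-order n-th moment (n even) of S - <S>_* with variance sigma2/N."""
    assert n % 2 == 0
    return math.factorial(n) / math.factorial(n // 2) * (sigma2 / (2 * N)) ** (n // 2)

def characteristic_function(k, sigma2, N, r_max=40):
    """Truncated moment series of <exp(i k x)>, x = sqrt(N) (S - <S>_*), even terms only."""
    total = 0.0
    for r in range(r_max + 1):
        n = 2 * r
        # (i k sqrt(N))^n equals (-k^2 N)^r for even n = 2r, so the sum stays real
        total += (-k * k * N) ** r / math.factorial(n) * gaussian_moment(n, sigma2, N)
    return total

k, sigma2 = 1.3, 0.7   # hypothetical wave number and asymptotic variance
for N in (100, 10_000):
    print(N, characteristic_function(k, sigma2, N), math.exp(-k * k * sigma2 / 2))
```

The factors of $N$ cancel term by term, so the series is independent of $N$ and sums to $e^{-k^2\sigma^2/2}$, the characteristic function of a Gaussian with variance $\sigma^2$.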


Diagrammatic Approach to Evaluate the Moments of S 

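The expression for the first moment of $S - \langle S\rangle_*$ follows from (4.3) and is equal to

$$\langle S - \langle S\rangle_*\rangle_N = \frac{1}{N}(1|\mathcal{E}^{(1)}\frac{1-\mathcal{E}'^N}{1-\mathcal{E}'}\mathcal{Q}_*|\rho_0), \tag{4.14}$$

which scales as $1/N$ as anticipated. Analogously the second moment is readily obtained from (4.6) by noticing that

$$(\Delta S)_N^2 = \langle(S - \langle S\rangle_*)^2\rangle_N - (\langle S\rangle_N - \langle S\rangle_*)^2. \tag{4.15}$$

For future reference we find it useful to rederive it:

$$\begin{aligned}
\langle(S-\langle S\rangle_*)^2\rangle_N
&= \sum_{s_1}\cdots\sum_{s_N}(S-\langle S\rangle_*)^2\,p(s_1, \ldots, s_N|\rho_0) \\
&= \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N}\sum_{s_1}\cdots\sum_{s_N}\delta s_j\,\delta s_i\,(1|\mathcal{E}_{s_N}\cdots\mathcal{E}_{s_1}|\rho_0) \\
&= \frac{1}{N^2}\sum_{i=1}^{N}(1|\bar{\mathcal{E}}^{(2)}\mathcal{E}^{i-1}|\rho_0) + \frac{2}{N^2}\sum_{j=2}^{N}\sum_{i=1}^{j-1}(1|\bar{\mathcal{E}}^{(1)}\mathcal{E}^{j-i-1}\bar{\mathcal{E}}^{(1)}\mathcal{E}^{i-1}|\rho_0).
\end{aligned} \tag{4.16}$$

To simplify this we insert the decomposition of the ergodic channel $\mathcal{E}$ given in (2.9), namely, we insert $\mathcal{P}_* = |\rho_*)(1|$ or $\mathcal{E}'$ in place of $\mathcal{E}$. Notice however that

$$\langle\delta s\rangle_* = (1|\bar{\mathcal{E}}^{(1)}|\rho_*) = 0. \tag{4.17}$$

Due to this condition, the places in which we can insert $\mathcal{P}_*$ are limited. The nonvanishing contributions to the second moment hence read

$$\begin{aligned}
\langle(S-\langle S\rangle_*)^2\rangle_N
&= \frac{1}{N^2}\sum_{i=1}^{N}(1|\bar{\mathcal{E}}^{(2)}|\rho_*) + \frac{2}{N^2}\sum_{j=2}^{N}\sum_{i=1}^{j-1}(1|\bar{\mathcal{E}}^{(1)}\mathcal{E}'^{\,j-i-1}\mathcal{Q}_*\bar{\mathcal{E}}^{(1)}|\rho_*) \\
&\quad + \frac{1}{N^2}\sum_{i=1}^{N}(1|\bar{\mathcal{E}}^{(2)}\mathcal{E}'^{\,i-1}\mathcal{Q}_*|\rho_0) + \frac{2}{N^2}\sum_{j=2}^{N}\sum_{i=1}^{j-1}(1|\bar{\mathcal{E}}^{(1)}\mathcal{E}'^{\,j-i-1}\mathcal{Q}_*\bar{\mathcal{E}}^{(1)}\mathcal{E}'^{\,i-1}\mathcal{Q}_*|\rho_0),
\end{aligned} \tag{4.18}$$

and the direct computation of the summations yields (A1) and (4.6).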

It is now clear why the variance $(\Delta S)_N^2$ in (4.6) as well as the second moment $\langle(S - \langle S\rangle_*)^2\rangle_N$ scales as $1/N$. There are two cases, as we saw in (4.16): (i) two points $\delta s_i$ and $\delta s_j$ coincide ($i = j$) and we have a single summation $\frac{1}{N^2}\sum_i$; (ii) points $\delta s_i$ and $\delta s_j$ do not coincide ($i \neq j$) and we have double summations $\frac{1}{N^2}\sum_j\sum_i$. In any case, once $\mathcal{E}^k$ with some power $k$ ($= j - i - 1$ or $i - 1$ in the above formula for the second moment) is substituted by $\mathcal{P}_*$, a summation $\frac{1}{N}\sum$ accumulates as $O(1)$, while the contribution from $\mathcal{E}'$ does not: recall the geometric series in (2.11), where the contribution from $\mathcal{E}'$ remains $O(1/N)$. Thus, the substitution rules for estimating the scaling are:

$$\frac{1}{N}\sum\mathcal{P}_* \to O(1) \quad\text{and}\quad \frac{1}{N}\sum\mathcal{E}'^{\,k} \to O(1/N). \tag{4.19}$$

Due to $(1|\bar{\mathcal{E}}^{(1)}|\rho_*) = 0$ and the coincidence of $\delta s_i$ and $\delta s_j$, we can insert at most one $\mathcal{P}_*$ in place of $\mathcal{E}$ for the second moment $\langle(S - \langle S\rangle_*)^2\rangle_N$: see (4.18). Therefore, the second moment $\langle(S - \langle S\rangle_*)^2\rangle_N$ is at most $O(1/N)$, and so is the variance $(\Delta S)_N^2$. Note that the second substitution rule in (4.19) is not valid if the channel $\mathcal{E}$ is ergodic but not mixing. Indeed, in such a case, the last term in (4.18) yields $O(1/N)$, as mentioned in the beginning of this subsection. The rule is safe if $\mathcal{E}$ is mixing.
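
The $1/N$ scaling of the second moment and its coefficient can be verified numerically. The sketch below is an illustration under assumptions: a qubit probe, a hypothetical rotation by angle `g` as the black-box channel, a two-outcome instrument of hypothetical strength `gamma` with outcome values $s = 0, 1$. It evaluates the second moment exactly from the double sum over pairs of measurement times and compares $N$ times the result with the asymptotic variance $(1|\bar{\mathcal{E}}^{(2)}|\rho_*) + 2(1|\bar{\mathcal{E}}^{(1)}(1-\mathcal{E}')^{-1}\bar{\mathcal{E}}^{(1)}|\rho_*)$, where the barred maps carry the centered outcomes $s - \langle s\rangle_*$.

```python
import numpy as np

def superop(K):
    """Matrix of rho -> K rho K^dagger on row-major vectorized rho."""
    return np.kron(K, K.conj())

g, gamma = 1.0, 0.5  # hypothetical parameter value and measurement strength
U = np.array([[np.cos(g / 2), -np.sin(g / 2)],
              [np.sin(g / 2),  np.cos(g / 2)]])
K = [np.array([[1.0, 0.0], [0.0, np.sqrt(1.0 - gamma)]]),
     np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]])]
E_s = [superop(Ks @ U) for Ks in K]
E = E_s[0] + E_s[1]
one = np.eye(2).reshape(-1)

evals, evecs = np.linalg.eig(E)
r = evecs[:, int(np.argmin(np.abs(evals - 1.0)))]
r = np.real(r / (one @ r))                       # stationary state |rho_*)
s_bar = float(one @ E_s[1] @ r)                  # stationary mean of s in {0, 1}
Eb1 = sum((s - s_bar) * Es for s, Es in enumerate(E_s))        # centered, first power
Eb2 = sum((s - s_bar) ** 2 * Es for s, Es in enumerate(E_s))   # centered, second power
Ep = E - np.outer(r, one)                        # remainder E' = E - |rho_*)(1|
sigma2 = float(one @ Eb2 @ r
               + 2.0 * one @ Eb1 @ np.linalg.inv(np.eye(4) - Ep) @ Eb1 @ r)

def second_moment(N, rho_vec):
    """Exact second moment of S - <S>_* after N measurements, in O(N) steps."""
    # tails[m] = (1| Eb1 (1 + E + ... + E^(m-1)), with tails[0] = 0
    tails, acc, row = [np.zeros(4)], np.zeros(4), one @ Eb1
    for _ in range(N - 1):
        acc = acc + row
        row = row @ E
        tails.append(acc)
    v, total = rho_vec.copy(), 0.0
    for i in range(1, N + 1):                    # v = E^(i-1) |rho_0)
        total += one @ Eb2 @ v + 2.0 * tails[N - i] @ Eb1 @ v
        v = E @ v
    return float(total) / N ** 2

rho0 = np.array([[0.0, 0.0], [0.0, 1.0]]).reshape(-1)
for N in (50, 500, 5000):
    print(N, N * second_moment(N, rho0), sigma2)
```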

We can generalize the above way of estimating the scaling to higher central moments, but slightly more sophisticated rules are required to check the asymptotic normality: we need to care about not only the scalings but also their coefficients. Anyway, the basic strategy to collect the leading-order contributions is to try to insert $\mathcal{P}_*$ as many times as possible in place of $\mathcal{E}$ avoiding $(1|\bar{\mathcal{E}}^{(1)}|\rho_*) = 0$. Another important observation is that the insertion of $\mathcal{P}_* = |\rho_*)(1|$ “breaks” the process into pieces. Recognizing these points, we introduce a diagrammatic way of representing the contributions to the moments.

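The $n$th moment is given by

$$\langle(S-\langle S\rangle_*)^n\rangle_N = \frac{1}{N^n}\sum_{i_1=1}^{N}\cdots\sum_{i_n=1}^{N}\sum_{s_1}\cdots\sum_{s_N}\delta s_{i_n}\cdots\delta s_{i_1}\,(1|\mathcal{E}_{s_N}\cdots\mathcal{E}_{s_1}|\rho_0). \tag{4.20}$$

Within the summations over $\{i_1, \ldots, i_n\}$ we relabel the $n$ points $\delta s_{i_1}, \ldots, \delta s_{i_n}$ in chronological order $1 \leq i_1 \leq \cdots \leq i_n \leq N$ and represent them by $n$ dots lined up in chronological order from right to left. See Fig. 2(a). The rightmost “o” represents the initial state $|\rho_0)$, and a trace $(1|$ is supposed to be at the left end. The points can coincide [$i_\ell = i_{\ell+1}$, as in the case $i = j$ for the second moment: see (4.16)], while between nondegenerate points ($i_\ell < i_{\ell+1}$) there are $\mathcal{E}^{i_{\ell+1}-i_\ell-1}$ (with a convention $i_0 = 0$), which are to be substituted by $\mathcal{P}_*$ or $\mathcal{E}'^{\,i_{\ell+1}-i_\ell-1}\mathcal{Q}_*$ as we did for the second moment [we need $\mathcal{Q}_*$ to remove $\mathcal{P}_*$ from $\mathcal{E}^0$ when $i_{\ell+1} - i_\ell - 1 = 0$: see (4.18)].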

Now,

i. When $\mathcal{E}^{i_{\ell+1}-i_\ell-1}$ between two points is substituted by $\mathcal{E}'^{\,i_{\ell+1}-i_\ell-1}\mathcal{Q}_*$, we connect the two points by a solid line.

ii. When $\mathcal{E}^{i_{\ell+1}-i_\ell-1}$ between two points is substituted by $\mathcal{P}_*$, we leave the two points disconnected.

iii. In the case where two or more points coincide, we connect the points by dashed lines. (Note that “o” cannot be connected by a dashed line.)

See Fig. 2(b). There are two constraints due to $(1|\bar{\mathcal{E}}^{(1)}|\rho_*) = 0$:

a. The leftmost two points are surely connected either by a solid line or by a dashed line, since we cannot insert $\mathcal{P}_*$ between them due to $(1|\bar{\mathcal{E}}^{(1)}|\rho_*) = 0$ with the leftmost trace $(1|$.

b. Each point (except for “o”) must be connected with at least one adjacent point either by a solid line or by a dashed line, since we cannot insert $\mathcal{P}_*$ on both sides of a point due to $(1|\bar{\mathcal{E}}^{(1)}|\rho_*) = 0$.

FIG. 2. (a) The $n$ points $\delta s_{i_1}, \ldots, \delta s_{i_n}$ in the $n$th moment $\langle(S-\langle S\rangle_*)^n\rangle_N$ are labeled in chronological order $1 \leq i_1 \leq \cdots \leq i_n \leq N$ and represented by $n$ dots lined up from right to left. The rightmost “o” represents the initial state $|\rho_0)$. (b) The basic elements for the diagrammatic representation of the contributions to the moments.

FIG. 3. The leading-order diagrams for a few lowest moments $\langle(S-\langle S\rangle_*)^n\rangle_N$.




Then, it is easy to draw the diagrams relevant to the leading-order contributions, with the largest possible number of $\mathcal{P}_*$ inserted. The relevant diagrams for $n = 2, 3, 4$ are shown in Fig. 3 [see how the two diagrams for $n = 2$ correspond to the two leading-order terms in (4.18)]. For each diagram contributing to the $n$th moment:

1. Assign $\mathcal{E}'^{\,i_{\ell+1}-i_\ell-1}\mathcal{Q}_*$ to each solid line.

2. Insert $\mathcal{P}_* = |\rho_*)(1|$ for each space between disconnected points.

3. Turn each (group of) dot(s) (connected by dashed lines) into $\bar{\mathcal{E}}^{(m)}$ (where $m$ is the number of connected dots) while “o” into $|\rho_0)$.

4. Close each diagram with a trace $(1|$ at the left end.

5. Put $\frac{1}{N^n}\sum\cdots\sum$ to sum the contributions over all possible distances between nondegenerate points respecting the chronological ordering of the points, with an appropriate coefficient counting how many times such a diagram (the specific ordering of the points) appears in the original full range summation $\frac{1}{N^n}\sum\cdots\sum$ exploring all possible orderings of the points. The right coefficient reads $n!/m_1!\,m_2!\cdots$, where $m_i$ are the numbers of coincident points connected by dashed lines in the relevant diagram and the factors $1/m_i!$ are to disregard the orderings among the coincident points.

It is easily recognized from Fig. 3 that the maximum number of $\mathcal{P}_*$'s we can insert for the $n$th moment is given by $\lfloor n/2\rfloor$ (where now $\lfloor x\rfloor$ denotes the largest integer not greater than $x$). Therefore, the substitution rules in (4.19) tell us that the $n$th moment scales as anticipated in (4.10) (the power of $N$ is obtained by $n - \lfloor n/2\rfloor = \lceil n/2\rceil$). As discussed in the beginning of this subsection, this is the right scaling for the central limit theorem, and only the even moments ($n = 2, 4, 6, \ldots$) are relevant. An important observation is that the leading-order contributions to the even moments are independent of the initial state $\rho_0$, since “o” representing the initial state $|\rho_0)$ is always disconnected from the first dot “•”. See Fig. 3.

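Let us look more carefully at the fourth moment. The leading-order contributions represented by the diagrams in Fig. 3 read

$$
\begin{aligned}
\langle (S-\langle S\rangle_*)^4\rangle_N
={}& \frac{4!}{2!\,2!}\frac{1}{N^4}\sum_{i_3=2}^{N}\sum_{i_1=1}^{i_3-1}
(1|\mathcal{E}^{(2)}|\rho_*)(1|\mathcal{E}^{(2)}|\rho_*) \\
&+ \frac{4!}{2!}\frac{1}{N^4}\sum_{i_3=3}^{N}\sum_{i_2=2}^{i_3-1}\sum_{i_1=1}^{i_2-1}
(1|\mathcal{E}^{(2)}|\rho_*)(1|\mathcal{E}^{(1)}\mathcal{E}'^{\,i_2-i_1-1}\mathcal{E}^{(1)}|\rho_*) \\
&+ \frac{4!}{2!}\frac{1}{N^4}\sum_{i_4=3}^{N}\sum_{i_3=2}^{i_4-1}\sum_{i_1=1}^{i_3-1}
(1|\mathcal{E}^{(1)}\mathcal{E}'^{\,i_4-i_3-1}\mathcal{E}^{(1)}|\rho_*)(1|\mathcal{E}^{(2)}|\rho_*) \\
&+ 4!\,\frac{1}{N^4}\sum_{i_4=4}^{N}\sum_{i_3=3}^{i_4-1}\sum_{i_2=2}^{i_3-1}\sum_{i_1=1}^{i_2-1}
(1|\mathcal{E}^{(1)}\mathcal{E}'^{\,i_4-i_3-1}\mathcal{E}^{(1)}|\rho_*)(1|\mathcal{E}^{(1)}\mathcal{E}'^{\,i_2-i_1-1}\mathcal{E}^{(1)}|\rho_*) \\
&+ O(1/N^3) \\
={}& \frac{4!}{2!\,2!}\frac{1}{2N^4}N(N-1)\,(1|\mathcal{E}^{(2)}|\rho_*)(1|\mathcal{E}^{(2)}|\rho_*) \\
&+ \frac{4!}{2!}\frac{1}{2N^4}N(N-1)\,(1|\mathcal{E}^{(2)}|\rho_*)(1|\mathcal{E}^{(1)}\frac{1}{1-\mathcal{E}'}\mathcal{E}^{(1)}|\rho_*) \\
&+ \frac{4!}{2!}\frac{1}{2N^4}N(N-1)\,(1|\mathcal{E}^{(1)}\frac{1}{1-\mathcal{E}'}\mathcal{E}^{(1)}|\rho_*)(1|\mathcal{E}^{(2)}|\rho_*) \\
&+ 4!\,\frac{1}{2N^4}N(N-1)\,(1|\mathcal{E}^{(1)}\frac{1}{1-\mathcal{E}'}\mathcal{E}^{(1)}|\rho_*)
\times(1|\mathcal{E}^{(1)}\frac{1}{1-\mathcal{E}'}\mathcal{E}^{(1)}|\rho_*) \\
&+ O(1/N^3).
\end{aligned}
\tag{4.21}
$$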


This suggests that, through the summations, each of the leading-order contributions to an even moment $\langle (S-\langle S\rangle_*)^n\rangle_N$ acquires a common factor (binomial coefficient) $N!/(n/2)!\,(N-n/2)!$, while each $\mathcal{E}'^{\,i_{2r}-i_{2r-1}-1}$ is transformed into $(1-\mathcal{E}')^{-1}$. Indeed, each leading-order diagram for an even moment consists of pairs of points $(i_{2r-1}, i_{2r})$ ($r = 1,\ldots,n/2$) connected by dashed or solid lines (see Fig. 3), and its evaluation proceeds two points by two points with the help of the following formulas for $r = 1,\ldots,n/2$ (with a convention $i_{n+1} = N+1$): for pairs of coincident points connected by dashed lines

$$
\sum_{i_{2r-1}=r}^{i_{2r+1}-1}\frac{(i_{2r-1}-1)!}{(r-1)!\,(i_{2r-1}-r)!}
= \frac{(i_{2r+1}-1)!}{r!\,(i_{2r+1}-r-1)!}
\tag{4.22}
$$

[we have a single sum for each coincident pair: see (4.21)], while for pairs of nondegenerate points connected by solid lines

$$
\begin{aligned}
&\sum_{i_{2r}=r+1}^{i_{2r+1}-1}\ \sum_{i_{2r-1}=r}^{i_{2r}-1}
\mathcal{E}'^{\,i_{2r}-i_{2r-1}-1}\frac{(i_{2r-1}-1)!}{(r-1)!\,(i_{2r-1}-r)!} \\
&\qquad= \sum_{i_{2r-1}=r}^{i_{2r+1}-1}
\frac{1-\mathcal{E}'^{\,i_{2r+1}-i_{2r-1}-1}}{1-\mathcal{E}'}\,
\frac{(i_{2r-1}-1)!}{(r-1)!\,(i_{2r-1}-r)!} \\
&\qquad= \frac{(i_{2r+1}-1)!}{r!\,(i_{2r+1}-r-1)!}\,\frac{1}{1-\mathcal{E}'} + O(i_{2r+1}^{\,r-1}).
\end{aligned}
\tag{4.23}
$$


[The actual ranges of the summations in the leading-order contributions to the even moments are slightly different from those in (4.22) and (4.23), but the corrections are finite and become negligible in the asymptotic regime $N \gg n$.] In this way, each leading-order diagram for an even moment acquires the binomial coefficient $N!/(n/2)!\,(N-n/2)!$, with each $\mathcal{E}'^{\,i_{2r}-i_{2r-1}-1}$ being transformed into $(1-\mathcal{E}')^{-1}$. This leads us to the following recipe for obtaining the expressions for the leading-order contributions to any even moment directly from the relevant diagrams:
FIG. 4. The leading-order contributions to the even moments factorize (panels for $n = 2$ and $n = 4$).


1'. Assign $(1|\mathcal{E}^{(2)}|\rho_*)/2$ to each pair of points connected by a dashed line.

2'. Assign $(1|\mathcal{E}^{(1)}(1-\mathcal{E}')^{-1}\mathcal{E}^{(1)}|\rho_*)$ to each pair of points connected by a solid line.

3'. Give a common factor $n!\,N!/N^n (n/2)!\,(N-n/2)! \simeq n!/N^{n/2}(n/2)!$ to each diagram.

Then, the leading-order contributions to the even moments factorize as shown in Fig. 4 and yield (4.11).
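The counting behind (4.22) and rule 3' can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names are hypothetical, and the sum in (4.22) is read as running from $i_{2r-1} = r$ to $i_{2r+1}-1$. It verifies the binomial identity and compares the exact common factor of rule 3' with its large-$N$ form.

```python
from math import comb, factorial


def dashed_pair_sum(r: int, upper: int) -> int:
    """Left-hand side of (4.22): sum of C(i - 1, r - 1) for i = r, ..., upper - 1."""
    return sum(comb(i - 1, r - 1) for i in range(r, upper))


def dashed_pair_closed_form(r: int, upper: int) -> int:
    """Right-hand side of (4.22): (upper - 1)! / (r! (upper - r - 1)!) = C(upper - 1, r)."""
    return comb(upper - 1, r)


def common_factor_exact(n: int, big_n: int) -> float:
    """Rule 3': n! N! / (N^n (n/2)! (N - n/2)!) for even n."""
    half = n // 2
    return factorial(n) * comb(big_n, half) / big_n**n


def common_factor_asymptotic(n: int, big_n: int) -> float:
    """Large-N form of rule 3': n! / (N^(n/2) (n/2)!)."""
    half = n // 2
    return factorial(n) / (big_n**half * factorial(half))


if __name__ == "__main__":
    print(dashed_pair_sum(2, 10), dashed_pair_closed_form(2, 10))
    print(common_factor_exact(4, 10_000), common_factor_asymptotic(4, 10_000))
```

For $n = 4$ the ratio of the two factors is $N(N-1)/N^2$, so the relative difference is of order $1/N$, in line with the $O(1/N^3)$ remainder in (4.21).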


V. USE OF CORRELATIONS 

An important difference from the standard strategy 
for parameter estimation, where independent identical 
experiments are performed to collect data, is that in 
the present sequential scheme the correlations among the 
measurement data are available for estimation. Combin¬ 
ing the information attainable from the correlations with 
that from the average S, the precision of the estimation 
can be enhanced. The primary motivation of the present 
paper is to explore this possibility. 

For instance, one can compute

$$
C_\ell = \frac{1}{N-\ell}\sum_{i=1}^{N-\ell} s_i s_{i+\ell}
\tag{5.1}
$$

from a single sequence of $N$ measurement outcomes $\{s_1,\ldots,s_N\}$, which captures the correlation between two data separated by a distance $\ell$. In the presence of the correlations among the data, $C_\ell$ may depend on the target parameter $g$ in a way that cannot be deduced solely from $S$. This might provide additional knowledge on how the parameter $g$ is encoded in the process and can enhance the precision of the estimation of $g$.
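As a concrete illustration of (5.1), the sketch below evaluates the average $S$ and the two-point correlation $C_\ell$ from one recorded sequence of outcomes. The function names are hypothetical, and $S$ is taken to be the plain arithmetic mean of the outcomes, as the text describes it.

```python
from typing import Sequence


def sample_average(outcomes: Sequence[float]) -> float:
    """Average S of a single sequence of measurement outcomes."""
    return sum(outcomes) / len(outcomes)


def two_point_correlation(outcomes: Sequence[float], ell: int) -> float:
    """C_ell of (5.1): mean of s_i * s_{i+ell} over the N - ell available pairs."""
    n = len(outcomes)
    if not 1 <= ell < n:
        raise ValueError("ell must satisfy 1 <= ell <= N - 1")
    pairs = n - ell
    return sum(outcomes[i] * outcomes[i + ell] for i in range(pairs)) / pairs


if __name__ == "__main__":
    data = [1, -1, 1, -1, 1, -1]
    print(sample_average(data), two_point_correlation(data, 1), two_point_correlation(data, 2))
```

No additional experiment is needed: both quantities are computed from the same list of outcomes.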

In principle, $\ell$ ranges over $\ell = 1,\ldots,N-1$, but recall that the correlation between two data is expected to decay exponentially as $\ell$ increases under a mixing channel: $C_\ell$ with $\ell$ greater than the correlation length would not contain useful information. In addition, $N$ should be much greater than $\ell$ so that the number of data $N-\ell$ used to evaluate $C_\ell$ is large enough. Therefore, we will require $N \gg L \ge \ell$, with $L$ being the maximum $\ell$ we take to estimate the parameter $g$.

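The correlations $C_\ell$ are also self-averaging quantities. Moreover, we are able to prove that the central limit theorem holds for the set of quantities $X = (S, C_1, \ldots, C_L)$. First, the expectation value of $C_\ell$ is evaluated as

$$
\begin{aligned}
\langle C_\ell\rangle_N
&= \sum_{s_1}\cdots\sum_{s_N} C_\ell\, p(s_1,\ldots,s_N|\rho_0) \\
&= \frac{1}{N-\ell}\sum_{i=1}^{N-\ell}\sum_{s_1}\cdots\sum_{s_N} s_i s_{i+\ell}\,(1|\mathcal{E}_{s_N}\cdots\mathcal{E}_{s_1}|\rho_0) \\
&= \frac{1}{N-\ell}\sum_{i=1}^{N-\ell}\sum_{s}\sum_{s'} s's\,(1|\mathcal{E}_{s'}\mathcal{E}^{\ell-1}\mathcal{E}_{s}\mathcal{E}^{i-1}|\rho_0) \\
&\to \langle C_\ell\rangle_*,
\end{aligned}
\tag{5.2}
$$

which approaches

$$
\langle C_\ell\rangle_* = \sum_{s}\sum_{s'} s's\,(1|\mathcal{E}_{s'}\mathcal{E}^{\ell-1}\mathcal{E}_{s}|\rho_*) = \langle s_i s_{i+\ell}\rangle_*
\tag{5.3}
$$

under the ergodicity of the channel $\mathcal{E}$. Then, let us look at the $n$th moment

$$
\mu_n(\boldsymbol{k}) = \left\langle\left[k_0(S-\langle S\rangle_*) + \sum_{\ell=1}^{L}k_\ell(C_\ell-\langle C_\ell\rangle_*)\right]^n\right\rangle_N.
\tag{5.4}
$$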

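It is an $n$th-order polynomial of $\boldsymbol{k} = (k_0,\ldots,k_L)$, and is a collection of all the $n$th moments among $X = (S, C_1, \ldots, C_L)$ as its coefficients. Since we are interested in the asymptotic limit $N \to \infty$, we collect the leading-order contributions to $\mu_n(\boldsymbol{k})$ for large $N$. The idea to do that is basically the same as that for $S$: we try to insert as many $\mathcal{P}_*$ as possible in place of $\mathcal{E}$ between points $\delta s_i$ from $S - \langle S\rangle_*$ and pairs of points

$$
\delta(s_i s_{i+\ell})_\ell = s_i s_{i+\ell} - \langle C_\ell\rangle_*
\tag{5.5}
$$

from $C_\ell - \langle C_\ell\rangle_*$. Since we have $\langle\delta s_i\rangle_* = (1|\mathcal{E}^{(1)}|\rho_*) = 0$ [Eq. (4.17)] and

$$
\langle\delta(s_i s_{i+\ell})_\ell\rangle_* = (1|\mathcal{E}_\ell^{(1)}|\rho_*) = 0,
\tag{5.6}
$$

where

$$
\mathcal{E}_\ell^{(1)} = \sum_{s}\sum_{s'}\delta(s's)_\ell\,\mathcal{E}_{s'}\mathcal{E}^{\ell-1}\mathcal{E}_{s},
\tag{5.7}
$$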

there should be at least two pieces [points $\delta s_i$ and/or pairs of points $\delta(s_i s_{i+\ell})_\ell$] between two $\mathcal{P}_*$. We just have to generalize the diagrammatic rules in Fig. 2: we represent each pair of points $\delta(s_i s_{i+\ell})_\ell$ by a dot “•” too, and if the pair overlaps with another pair or a point $\delta s_i$ we connect the couple of dots by a dashed line (see Fig. 5), while if it does not we leave it disconnected from or connect it with its adjacent dot by a solid line, depending on whether $\mathcal{P}_*$ is inserted between them or not. The ranges of the summations $\sum\cdots\sum$ exploring all possible distances between dots “•” should be carefully arranged depending on whether the dots represent points $\delta s_i$ or pairs of points $\delta(s_i s_{i+\ell})_\ell$, and some of the prefactors in $1/N^n$ are replaced by $1/(N-\ell)$, but such details become irrelevant in the asymptotic regime $N \gg n, L$. Then, the analysis goes in the same way as before: the leading-order diagrams are again given by Fig. 3, and the $n$th moment $\mu_n(\boldsymbol{k})$ for an even $n$ asymptotically factorizes pairwise as in Fig. 4, where the pair of dots connected by a dashed line or a solid line represents the collection of all pairwise combinations among $\delta s_i$ and $\delta(s_i s_{i+\ell})_\ell$ ($\ell = 1,\ldots,L$), with some care on the coefficients to distinguish different orderings of the pieces, i.e., give a coefficient $1/2$ to the pair connected by the solid line in Fig. 4 and collect contributions with different orderings of the pieces [see the second and third terms in $\Sigma_{0\ell}$ and in $\Sigma_{\ell\ell'}$ in (5.9)-(5.11) below]. We get

FIG. 5. Each pair of points $\delta(s_i s_{i+\ell})_\ell$ is also represented by a dot “•”, and if the pair overlaps with another pair or a point $\delta s_i$, we connect the couple of dots by a dashed line. Several terms are involved in such a single connected diagram as shown here. These diagrams represent the operators defined in (5.12) and (5.13).

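$$
\mu_n(\boldsymbol{k}) =
\begin{cases}
\dfrac{n!}{(2N)^{n/2}(n/2)!}\left(\displaystyle\sum_{\ell=0}^{L}\sum_{\ell'=0}^{L}k_\ell\Sigma_{\ell\ell'}k_{\ell'}\right)^{n/2} + O(1/N^{n/2+1}) & (n\ \text{even}), \\[2ex]
O(1/N^{(n+1)/2}) & (n\ \text{odd}),
\end{cases}
\tag{5.8}
$$

where

$$
\Sigma_{00} = (1|\mathcal{E}^{(2)}|\rho_*) + 2(1|\mathcal{E}^{(1)}\frac{1}{1-\mathcal{E}'}\mathcal{E}^{(1)}|\rho_*) = \sigma^2,
\tag{5.9}
$$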

+ (5-10) 

+ (5-11) 


with $\mathcal{E}^{(n)}$ and $\mathcal{E}_\ell^{(1)}$ defined in (4.7) and (5.7), respectively,
4"^ = E + <5si)J(s2Sl), £s,£^-^£s. 

Si S 2 
£-1 

+ EEEE (5s2(5(s3Si)i!£s3£'' ^£s2^^ ^ 

k —1 Si S2 S3 

(5.12) 


c(2) CO** I \ ' 00*0* 

~ + 2-^ ,k 

k=l 


+ Sit £2* + (1 — Sw)i^u* + + ^£t°*) 

(5.13) 


composed of 

4* = (5.14) 

Si S2 


4T = E E E 6{s2Si)e,£s,£^-^£s2£^'-^£s, + ^ f), 

Si S2 S3 

47'?* = E E E E 4^452)^ + (^ o f), 

Si S2 S3 S4 

47* = E E E , 

Si S2 S3 


(5.15) 

(5.16) 


(5.17) 
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4 '* = (5-18) 

-Si -S2 S 3 
\i-£'\-l 

4'°* = E EEEE'^("3S2)n.in(^.^-)5(S4Sl)^ax(A,0^^4^"-'f.a^””^'’''^-'^.2f"-'''-"-'^a,, (5.19) 

k—1 Si S2 S3 S4 


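corresponding to the diagrams in Fig. 5. We provide the complete expressions for the covariances among $S$ and $C_\ell$ valid for any (even small) $N$ in Appendix A, whose asymptotic forms coincide with the covariances (5.9)-(5.11) divided by $N$.

This result shows that the set of scaled variables $\sqrt{N}(S - \langle S\rangle_*)$ and $\sqrt{N}(C_\ell - \langle C_\ell\rangle_*)$ asymptotically become normal in the limit $N \to \infty$. The characteristic function reads

$$
\sum_{n=0}^{\infty}\frac{i^n}{n!}N^{n/2}\mu_n(\boldsymbol{k})
\ \to\ \sum_{r=0}^{\infty}\frac{1}{r!}\left(-\frac{1}{2}\sum_{\ell=0}^{L}\sum_{\ell'=0}^{L}k_\ell\Sigma_{\ell\ell'}k_{\ell'}\right)^{r}
= e^{-\frac{1}{2}\boldsymbol{k}^{T}\Sigma\boldsymbol{k}},
\tag{5.20}
$$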


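where $\Sigma$ is the $(L+1)\times(L+1)$ matrix with its matrix elements given by the covariances in (5.9)-(5.11). The central limit theorem holds, and the probability distribution $P(X)$ of $X = (S, C_1, \ldots, C_L)$ becomes asymptotically Gaussian,

$$
P(X) \simeq \frac{e^{-\frac{1}{2}N(X-\langle X\rangle_*)^{T}\Sigma^{-1}(X-\langle X\rangle_*)}}{\sqrt{(2\pi/N)^{L+1}\det\Sigma}},
\tag{5.21}
$$

peaked at $X = \langle X\rangle_*$ with a shrinking covariance $\Sigma/N$. This ensures that the single-shot values $X$ computed from a single sequence of measurement data well represent their expectation values $\langle X\rangle_*$, through which we can estimate a parameter $g$. The uncertainty $\delta g$ in the estimation of $g$ is given by (3.8) with the Fisher information

$$
\mathcal{F}_L(g) = \int d^{L+1}X\,P(X)\left(\frac{\partial}{\partial g}\ln P(X)\right)^{2}
\simeq N\,\frac{\partial\langle X\rangle_*^{T}}{\partial g}\,\Sigma^{-1}\,\frac{\partial\langle X\rangle_*}{\partial g},
\tag{5.22}
$$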


which increases linearly in $N$, and the uncertainty $\delta g$ diminishes as $\delta g \sim 1/\sqrt{N}$ [in (5.22) we have omitted the contribution from $\partial\Sigma/\partial g$ to the Fisher information $\mathcal{F}_L(g)$ since it does not grow with $N$]. Moreover, this Fisher information $\mathcal{F}_L(g)$ for the estimation of $g$ through a set of quantities $X = (S, C_1, \ldots, C_L)$ can be greater than the Fisher information $\mathcal{F}(g) = \mathcal{F}_0(g)$ given in (3.9) for the estimation of the same $g$ but solely through $S$. The precision of the estimation can be enhanced by looking at the correlation data $C_\ell$ in addition to the average $S$.
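The asymptotic form of (5.22) is straightforward to evaluate once the derivatives of the expectation values and the covariance matrix are known. The sketch below is illustrative only: the function name and the numbers in the usage example are made up, not taken from the model of Sec. VI. It computes $N\,\partial_g\langle X\rangle_*^{T}\Sigma^{-1}\partial_g\langle X\rangle_*$ and shows that including a correlated second quantity cannot lower the result.

```python
import numpy as np


def gaussian_fisher_information(dmean_dg, sigma, n_measurements: int) -> float:
    """Leading-order Fisher information N * d<X>^T Sigma^{-1} d<X> of (5.22).

    dmean_dg: derivatives of (<S>_*, <C_1>_*, ..., <C_L>_*) with respect to g.
    sigma: (L+1) x (L+1) positive-definite matrix of scaled covariances.
    The contribution of dSigma/dg is neglected, as in the text.
    """
    dmean = np.asarray(dmean_dg, dtype=float)
    cov = np.asarray(sigma, dtype=float)
    return n_measurements * float(dmean @ np.linalg.solve(cov, dmean))


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to illustrate the comparison.
    dmean = [0.5, 0.8]
    sigma = [[1.0, 0.3], [0.3, 2.0]]
    f_0 = gaussian_fisher_information(dmean[:1], [[sigma[0][0]]], 1000)
    f_1 = gaussian_fisher_information(dmean, sigma, 1000)
    print(f_0, f_1)
```

For a positive-definite $\Sigma$, the quadratic form built from the full vector is never smaller than the one built from its first component alone, which is the algebraic content of the statement that $\mathcal{F}_L(g)$ can exceed $\mathcal{F}_0(g)$.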


Here we have considered the two-point correlations $C_\ell$
as well as the average $S$. If we incorporate higher-order
correlations with more points, the precision of the estima¬ 
tion can be further improved. On the other hand, corre¬ 
lations with too many points would not be helpful, since 
the number of data used to evaluate such correlations 
is reduced, and some of the points involved in the cor¬ 
relations are separated beyond the correlation length of 
the mixing channel supplying no more information than 
lower-order correlations. It is currently not clear to what 
extent we can improve the precision of the estimation by 
looking at higher-order correlations. 


VI. EXAMPLE: ESTIMATION OF THE 
TEMPERATURE OF A RESERVOIR 


In this section we analyze an explicit example, where the correlations among the data collected by the sequential measurements would be useful for improving the estimation of a parameter. The setting we consider is related to quantum thermometry, which aims to use low-dimensional quantum systems (say qubits) as temperature probes to minimize the undesired disturbance on the sample (see e.g. Refs. [34, 35] and references therein). Specifically we focus on the paradigmatic example with a qubit probe in contact with a thermal reservoir at a finite temperature $T$. Our goal is to estimate the temperature $T$ of the reservoir by monitoring the relaxation dynamics induced on the qubit, which effectively plays the role of a local “thermometer” (Fig. 6). In our approach we describe the probe-reservoir coupling in terms of the resulting Markovian master equation [30, 36-39] operating on the probe, i.e.,

$$
\begin{aligned}
\frac{d}{dt}\rho(t) ={}& -\frac{i}{2}\Omega[\sigma_z, \rho(t)] \\
&- \frac{1}{2}\gamma_+[\sigma_+\sigma_-\rho(t) + \rho(t)\sigma_+\sigma_- - 2\sigma_-\rho(t)\sigma_+] \\
&- \frac{1}{2}\gamma_-[\sigma_-\sigma_+\rho(t) + \rho(t)\sigma_-\sigma_+ - 2\sigma_+\rho(t)\sigma_-],
\end{aligned}
\tag{6.1}
$$

where $\rho(t)$ represents the state of the qubit, $\hbar\Omega$ is the energy gap between the excited $|{\uparrow}\rangle$ and ground $|{\downarrow}\rangle$ states of the qubit, and

$$
\sigma_z = |{\uparrow}\rangle\langle{\uparrow}| - |{\downarrow}\rangle\langle{\downarrow}|, \quad
\sigma_+ = |{\uparrow}\rangle\langle{\downarrow}|, \quad
\sigma_- = |{\downarrow}\rangle\langle{\uparrow}|.
\tag{6.2}
$$

FIG. 6. We estimate the temperature of a thermal reservoir through measurements performed on a probe qubit in contact with the reservoir.

The two relaxation constants $\gamma_+$ (for decay) and $\gamma_-$ (for excitation) are related to the temperature of the reservoir $T$, respecting the detailed balance condition. For a bosonic thermal reservoir, they are given by [36-39]

$$
\gamma_+ = (1 + n_{\mathrm{th}})\gamma, \quad
\gamma_- = n_{\mathrm{th}}\gamma, \quad
n_{\mathrm{th}} = \frac{1}{e^{\hbar\Omega/k_B T} - 1},
\tag{6.3}
$$

with $k_B$ being the Boltzmann constant. We assume that the parameters $\Omega$ and $\gamma$ (i.e., the characteristics of the thermometer) are known. Estimating the temperature $T$ is then equivalent to estimating

$$
\gamma_\beta = \gamma_+ + \gamma_- = \gamma\coth\frac{\hbar\Omega}{2k_B T},
\tag{6.4}
$$

while $\gamma = \gamma_+ - \gamma_-$ is a known constant independent of the temperature $T$. The higher the temperature, the larger the decay rate $\gamma_\beta$.
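The relations (6.3) and (6.4) can be turned into a small calculator. The sketch below is a hedged illustration with hypothetical function names: it assumes SI units, with $\Omega$ an angular frequency in rad/s and $T$ in kelvin, and shows how an estimate of $\gamma_\beta$ maps back to a temperature by inverting the hyperbolic cotangent.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
K_B = 1.380649e-23      # Boltzmann constant, J/K


def thermal_occupation(omega: float, temperature: float) -> float:
    """n_th = 1 / (exp(hbar*omega / (k_B*T)) - 1), as in (6.3)."""
    return 1.0 / math.expm1(HBAR * omega / (K_B * temperature))


def relaxation_rates(gamma: float, omega: float, temperature: float):
    """Return (gamma_plus, gamma_minus) of (6.3)."""
    n_th = thermal_occupation(omega, temperature)
    return (1.0 + n_th) * gamma, n_th * gamma


def gamma_beta(gamma: float, omega: float, temperature: float) -> float:
    """gamma_beta = gamma * coth(hbar*omega / (2*k_B*T)), as in (6.4)."""
    return gamma / math.tanh(HBAR * omega / (2.0 * K_B * temperature))


def temperature_from_gamma_beta(gamma_b: float, gamma: float, omega: float) -> float:
    """Invert (6.4): coth(x) = gamma_b / gamma gives x = atanh(gamma / gamma_b)."""
    x = math.atanh(gamma / gamma_b)
    return HBAR * omega / (2.0 * K_B * x)


if __name__ == "__main__":
    omega = 2.0 * math.pi * 5.0e9   # illustrative qubit frequency
    g_b = gamma_beta(1.0e6, omega, 0.1)
    print(g_b, temperature_from_gamma_beta(g_b, 1.0e6, omega))
```

Since $\gamma_\beta \ge \gamma$ for any $T > 0$, the argument of `atanh` stays below one and the inversion is well defined.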


A. Standard Strategy 

The information about the temperature $T$, namely, the parameter $\gamma_\beta$, is imprinted in the state of the qubit through the dynamics under the influence of the thermal reservoir, i.e., by the action of the quantum channel $\Lambda_t$ which is the solution to the master equation (6.1). Then, the standard strategy to estimate the parameter $\gamma_\beta$ is

(i) to prepare the qubit in a specific initial state $\rho_0$,

(ii) to let the qubit evolve $\rho(\tau) = \Lambda_\tau(\rho_0)$ for a certain time $\tau$ in contact with the thermal reservoir, and

(iii) to measure a specific observable in the state $\rho(\tau)$.

We repeat this experiment $N$ times to collect measurement results, from which we estimate the parameter $\gamma_\beta$.

For instance, we prepare the qubit in a specific initial state $\rho_0$, say in the excited state $|{\uparrow}\rangle$, and after a fixed waiting time $\tau$ we measure the qubit to check whether it is in the excited state $|{\uparrow}\rangle$ or in the ground state $|{\downarrow}\rangle$. We repeat this process $N$ times, and we estimate $\gamma_\beta$ from the survival probability of the initial state $|{\uparrow}\rangle$ after time $\tau$. Our measurement however can be weak and unsharp: here we consider the measurement which provokes the following back-action on the qubit,

$$
\rho \to \mathcal{M}_s(\rho) = M_s\rho M_s^\dagger \quad (s = \pm 1)
\tag{6.5}
$$

with

$$
\begin{cases}
M_{+1} = \cos\eta\,|{\uparrow}\rangle\langle{\uparrow}| + \sin\eta\,|{\downarrow}\rangle\langle{\downarrow}|, \\
M_{-1} = \sin\eta\,|{\uparrow}\rangle\langle{\uparrow}| + \cos\eta\,|{\downarrow}\rangle\langle{\downarrow}|,
\end{cases}
\tag{6.6}
$$

depending on the outcome of the measurement $s$. This measurement process can be simulated with a CNOT gate [40, 41]. The parameter $\eta$ controls the precision and the strength of the measurement: $\eta = 0$ provides the projective measurement, while with $\eta = \pi/4$ the measurement gives totally random results with no disturbance on the measured system. The probability of obtaining the measurement outcome $s$ in the state $\rho(\tau)$ is then given by

$$
P_\tau(s|\rho_0) = \mathrm{Tr}\{\mathcal{M}_s(\rho(\tau))\} = \mathrm{Tr}\{\Pi_s\rho(\tau)\},
\tag{6.7}
$$

where

$$
\Pi_s = M_s^\dagger M_s =
\begin{cases}
\cos^2\eta\,|{\uparrow}\rangle\langle{\uparrow}| + \sin^2\eta\,|{\downarrow}\rangle\langle{\downarrow}| & (s = +1), \\
\sin^2\eta\,|{\uparrow}\rangle\langle{\uparrow}| + \cos^2\eta\,|{\downarrow}\rangle\langle{\downarrow}| & (s = -1)
\end{cases}
\tag{6.8}
$$

are the POVM elements of this measurement. The uncertainty in the estimation is then bounded by the Cramér-Rao inequality [3, 7-12, 33]

$$
\delta\gamma_\beta \ge \frac{1}{\sqrt{N F(\gamma_\beta)}}
\tag{6.9}
$$

with the Fisher information given by

$$
F(\gamma_\beta) = \sum_{s} P_\tau(s|\rho_0)\left(\frac{\partial}{\partial\gamma_\beta}\ln P_\tau(s|\rho_0)\right)^{2}.
\tag{6.10}
$$
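The measurement model of (6.5)-(6.8) can be written down explicitly as $2\times 2$ matrices. The following is a minimal sketch with hypothetical function names, assuming the basis ordering $(|{\uparrow}\rangle, |{\downarrow}\rangle)$; it builds the two measurement operators and evaluates the outcome probabilities of (6.7).

```python
import numpy as np


def kraus_operators(eta: float) -> dict:
    """Measurement operators M_{+1}, M_{-1} of (6.6) in the basis (up, down)."""
    up = np.array([[1.0, 0.0], [0.0, 0.0]], dtype=complex)
    down = np.array([[0.0, 0.0], [0.0, 1.0]], dtype=complex)
    return {
        +1: np.cos(eta) * up + np.sin(eta) * down,
        -1: np.sin(eta) * up + np.cos(eta) * down,
    }


def outcome_probabilities(rho: np.ndarray, eta: float) -> dict:
    """P(s) = Tr{M_s rho M_s^dagger} for s = +1 and s = -1, as in (6.7)."""
    return {
        s: float(np.real(np.trace(m @ rho @ m.conj().T)))
        for s, m in kraus_operators(eta).items()
    }


if __name__ == "__main__":
    rho_up = np.array([[1.0, 0.0], [0.0, 0.0]], dtype=complex)
    print(outcome_probabilities(rho_up, 0.0))        # projective limit
    print(outcome_probabilities(rho_up, np.pi / 4))  # uninformative limit
```

The two limits quoted in the text are visible directly: at $\eta = 0$ the operators are projectors, while at $\eta = \pi/4$ both are proportional to the identity, so the outcome is random and the state is left untouched.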


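For the present model, the Bloch vector of the qubit evolves as

$$
\begin{cases}
\langle\sigma_x\rangle_t = e^{-\gamma_\beta t/2}\left(\langle\sigma_x\rangle_0\cos\Omega t - \langle\sigma_y\rangle_0\sin\Omega t\right), \\
\langle\sigma_y\rangle_t = e^{-\gamma_\beta t/2}\left(\langle\sigma_x\rangle_0\sin\Omega t + \langle\sigma_y\rangle_0\cos\Omega t\right), \\
\langle\sigma_z\rangle_t = e^{-\gamma_\beta t}\left(\langle\sigma_z\rangle_0 + \dfrac{\gamma}{\gamma_\beta}\right) - \dfrac{\gamma}{\gamma_\beta},
\end{cases}
\tag{6.11}
$$

where $\sigma_x = \sigma_+ + \sigma_-$ and $\sigma_y = -i(\sigma_+ - \sigma_-)$. The equilibrium state $\rho_{\mathrm{eq}}$ is characterized by

$$
\langle\sigma_x\rangle_{\mathrm{eq}} = \langle\sigma_y\rangle_{\mathrm{eq}} = 0, \quad
\langle\sigma_z\rangle_{\mathrm{eq}} = -\frac{\gamma}{\gamma_\beta},
\tag{6.12}
$$

namely,

$$
\rho_{\mathrm{eq}} = \frac{1}{2}\left(1 - \frac{\gamma}{\gamma_\beta}\sigma_z\right)
= \frac{e^{-\beta\hbar\Omega\sigma_z/2}}{\mathrm{Tr}\,e^{-\beta\hbar\Omega\sigma_z/2}}.
\tag{6.13}
$$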
FIG. 7. The dependence of the quantum Fisher information $F_Q(\gamma_\beta)$ given in (6.16) on the polar angle $\theta$ of a generic pure initial state $|\psi_0\rangle = \cos(\theta/2)|{\uparrow}\rangle + \sin(\theta/2)|{\downarrow}\rangle$ and on the waiting time $\tau$, for different values of $\gamma_\beta/\gamma$, i.e., for different temperatures. Note that $F_Q(\gamma_\beta)$ is symmetric around the polar axis and is independent of the azimuthal angle $\varphi$.


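The probability distribution of the outcomes of the measurement (6.7) at time $\tau$ reads

$$
P_\tau(\pm 1|\rho_0) = \frac{1}{2}\left(1 \pm \langle\sigma_z\rangle_\tau\cos 2\eta\right),
\tag{6.14}
$$

and the Fisher information $F(\gamma_\beta)$ in (6.10) is evaluated to be

$$
F(\gamma_\beta) = \frac{\cos^2 2\eta}{1 - \langle\sigma_z\rangle_\tau^2\cos^2 2\eta}\left(\frac{\partial\langle\sigma_z\rangle_\tau}{\partial\gamma_\beta}\right)^{2}.
\tag{6.15}
$$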
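Equations (6.11), (6.14), and (6.15) combine into a closed-form Fisher information for the standard strategy. The sketch below uses hypothetical function names and treats $\gamma$ as fixed when differentiating with respect to $\gamma_\beta$, as the text states. Note that the expression becomes $0/0$ when $\langle\sigma_z\rangle_\tau\cos 2\eta = \pm 1$ (for example at $\tau = 0$ with a projective measurement on an eigenstate), a case the code does not handle.

```python
import math


def sigma_z_expectation(z0: float, gamma: float, gamma_b: float, tau: float) -> float:
    """<sigma_z>_tau from (6.11) for an initial value z0 = <sigma_z>_0."""
    return math.exp(-gamma_b * tau) * (z0 + gamma / gamma_b) - gamma / gamma_b


def d_sigma_z_d_gamma_b(z0: float, gamma: float, gamma_b: float, tau: float) -> float:
    """Derivative of <sigma_z>_tau with respect to gamma_beta at fixed gamma."""
    decay = math.exp(-gamma_b * tau)
    return -tau * decay * (z0 + gamma / gamma_b) + (1.0 - decay) * gamma / gamma_b**2


def fisher_information(z0: float, gamma: float, gamma_b: float, tau: float, eta: float) -> float:
    """Fisher information (6.15) for the measurement of strength eta at time tau."""
    c = math.cos(2.0 * eta)
    z = sigma_z_expectation(z0, gamma, gamma_b, tau)
    dz = d_sigma_z_d_gamma_b(z0, gamma, gamma_b, tau)
    return c**2 * dz**2 / (1.0 - (z * c) ** 2)


if __name__ == "__main__":
    # Ground-state input (z0 = -1), gamma_beta / gamma = 1.5, several strengths.
    for eta in (0.0, 0.2, 0.4):
        print(eta, fisher_information(-1.0, 1.0, 1.5, 0.7, eta))
```

Running the loop shows the Fisher information decreasing as $\eta$ grows, consistent with the statement that weaker measurements extract less information per shot.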


A larger Fisher information would be attainable by 
measuring a different observable. The maximum Fisher 
information one can reach with the optimal measurement 
is given by the quantum Fisher information [2, 3, 7-12], 

= = (6.16) 

with the symmetric logarithmic derivative 

= (6.17) 


where $V$ is a $3\times 3$ matrix whose matrix elements are given by

$$
V_{ij} = \delta_{ij} - \langle\sigma_i\rangle_\tau\langle\sigma_j\rangle_\tau \quad (i, j = x, y, z).
\tag{6.18}
$$

Notice here that both the Fisher information $F(\gamma_\beta)$ in (6.15) and the quantum Fisher information $F_Q(\gamma_\beta)$ in (6.16) depend on the choice of the initial state $\rho_0$. Because of the convexity of the quantum Fisher information, the maximum of the quantum Fisher information (the best estimation) is always achieved by choosing a pure input state $\rho_0 = |\psi_0\rangle\langle\psi_0|$ [3, 42]. Moreover, for the present problem, the ground state of the qubit $|\psi_0\rangle = |{\downarrow}\rangle$ is the optimal choice, in the sense that the maximum of $F_Q(\gamma_\beta)$ for a given temperature is achieved with $|\psi_0\rangle = |{\downarrow}\rangle$ (see Fig. 7); the temporal behavior of $F_Q(\gamma_\beta)$ for $|\psi_0\rangle = |{\downarrow}\rangle$ is plotted in Fig. 8. For this specific initial state $\rho_0 = |{\downarrow}\rangle\langle{\downarrow}|$, the Fisher information $F(\gamma_\beta)$ in (6.15) with $\eta = 0$ coincides with the quantum Fisher information $F_Q(\gamma_\beta)$ in (6.16), for any time $\tau$ and for any $\gamma_\beta$: the projective measurement to discriminate $|{\uparrow}\rangle$ and $|{\downarrow}\rangle$ is the optimal measurement. For nonvanishing $\eta > 0$ the Fisher information $F(\gamma_\beta)$ is reduced, and the weaker the measurement, the smaller the Fisher information $F(\gamma_\beta)$, as shown in Fig. 9.

FIG. 8. The temporal behavior of the quantum Fisher information $F_Q(\gamma_\beta)$ given in (6.16) for $\rho_0 = |{\downarrow}\rangle\langle{\downarrow}|$ and for different $\gamma_\beta$ (for different temperatures).

FIG. 9. The temporal behavior of the Fisher information $F(\gamma_\beta)$ given in (6.15) for $\rho_0 = |{\downarrow}\rangle\langle{\downarrow}|$ and $\gamma_\beta/\gamma = 1.5$ with different strengths of the measurement $\eta$. In the case of projective measurement $\eta = 0$, the Fisher information $F(\gamma_\beta)$ coincides with the quantum Fisher information $F_Q(\gamma_\beta)$ given in (6.16) and plotted in Fig. 8.
B. Sequential Scheme

Let us now turn our attention to the sequential scheme. First, it is important to check whether the channel $\mathcal{E}$ defined in (3.5) with $\mathcal{E}_s$ in (3.2) is mixing. For the present model, the spectrum of $\mathcal{E}$ is given by

(6.19) 


and therefore, $\mathcal{E}$ is mixing for any $\tau > 0$ with a unique fixed point (the eigenstate belonging to the eigenvalue 1)

$$
\rho_* = \rho_{\mathrm{eq}},
\tag{6.20}
$$

which coincides with the equilibrium state $\rho_{\mathrm{eq}}$ in (6.13) of the free relaxation process. This mixing is apparently a direct consequence of the irreversibility of the relaxation process $\Lambda_\tau$ of the probe qubit. Since $\mathcal{E}$ is mixing, the sequential scheme works for the present problem.

Let us take the average of the outcomes of a sequence of $N$ measurements, $S$ defined in (3.6), as a quantity through which we estimate $\gamma_\beta$. For the present model, its expectation value is computed to be


{ S)n 


. 7/3 


1 1 - ( 
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J cos2?7 
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( 6 . 21 ) 


and the variance to be 
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( 6 . 22 ) 


As $N$ increases, both become independent of the initial state $\rho_0$, and the variance $(\Delta S)_N^2$ shrinks as $1/N$,


(S) 


{£^S)% —)■ — 


N 


sin^2?7 - 


- - cos 277 , 

7/3 


1 + 
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(6.23) 



cos^277 


(6.24) 


In other words, $S$ evaluated from a single sequence of measurements almost certainly exhibits a value very close to its expectation value $\langle S\rangle_N$, which is a function of $\gamma_\beta$. Therefore, by comparing $S$ (obtained via a single experimental run) with its expectation value $\langle S\rangle_N$ [given by the formula (6.23)], the parameter $\gamma_\beta$ is estimated with the uncertainty regulated by the variance $(\Delta S)_N^2$ in (6.24), i.e., with the precision given by the Fisher information $\mathcal{F}(\gamma_\beta) = \mathcal{F}_0(\gamma_\beta)$ in (3.9),


-^ 0 ( 7 / 3 ) 
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1—^1 cos^2?7 


(6.25) 

which is to be compared with the Fisher information $F(\gamma_\beta)$ in (6.15) by the standard strategy (see Fig. 10 below).

As stressed above, the correlations among the acquired data are also available for the estimation in the sequential scheme. For instance, the two-point correlations $C_\ell$ defined in (5.1) can be used to estimate $\gamma_\beta$. Their expectation values (for a generic initial state $\rho_0$) are given by


{C£)n = 



1 1 _ g-(3V-t)7^r 


N-. 


e'lpr _ I 


(1 _ 
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1 7 
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>0,N>e+l), (6.26) 
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and their covariances (in the stationary state po = p*) by 
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(^ > 1, iV > £ + 1) 
(6.28) 


[the complete expression for the covariance $\langle C_\ell C_{\ell'}\rangle_N - \langle C_\ell\rangle_N\langle C_{\ell'}\rangle_N$ valid for any $\ell, \ell' \ge 1$ and $N \ge \max(\ell, \ell') + 1$ (but for $\rho_0 = \rho_*$) is given in Appendix B]. All the covariances scale as $1/N$, and the Fisher information (5.22) increases linearly in $N$. This ensures that, by comparing the set of quantities $(S, C_1, \ldots, C_L)$ evaluated from a single sequence of measurement data with the set of their expectation values $(\langle S\rangle_N, \langle C_1\rangle_N, \ldots, \langle C_L\rangle_N)$, one can estimate $\gamma_\beta$ with the precision given by the Fisher information $\mathcal{F}_L(\gamma_\beta)$ computed by the formula (5.22). It is reasonable to expect that the estimation with the multiple quantities $(S, C_1, \ldots, C_L)$ is better in precision than the estimation solely through the average $S$, namely, that the Fisher information $\mathcal{F}_L(\gamma_\beta)$ ($L > 0$) is larger than the Fisher information $\mathcal{F}_0(\gamma_\beta)$, and that the more correlations are incorporated (the larger the number $L$), the larger the Fisher information $\mathcal{F}_L(\gamma_\beta)$.

Let us look at two different regimes.


1. Projective Measurement $\eta = 0$

The Fisher informations $\mathcal{F}_L(\gamma_\beta)$ ($L = 0, 1, 2$) by the sequential scheme are plotted in the five panels in Fig. 10(a) and are compared with the Fisher information $F(\gamma_\beta)$ by the standard strategy with $\rho_0 = |{\downarrow}\rangle\langle{\downarrow}|$ for the case of projective measurements $\eta = 0$. In this case, the Fisher information $F(\gamma_\beta)$ coincides with the quantum Fisher information $F_Q(\gamma_\beta)$ in the standard strategy.

Compare first $F(\gamma_\beta)$ and $\mathcal{F}_0(\gamma_\beta)/N$ (per measurement). We observe that the standard strategy provides better estimation than the sequential scheme. Recall here that the input state $\rho_0 = |{\downarrow}\rangle\langle{\downarrow}|$ for $F(\gamma_\beta)$ is the optimal one for the standard strategy. On the other hand, in the sequential scheme, the state of the qubit is projected into $|{\uparrow}\rangle$ or $|{\downarrow}\rangle$ depending on the outcome of the projective measurement. If it is projected into $|{\uparrow}\rangle$ by a measurement, it restarts to evolve from this non-optimal state for the next measurement. Not all the steps in the sequential measurements are optimal for the estimation. That is why the sequential scheme cannot beat the standard strategy in the case of projective measurement.

One can improve the performance of the sequential scheme by incorporating $C_1$ for the estimation. Indeed, as is clear from Fig. 10(a), the Fisher information $\mathcal{F}_1(\gamma_\beta)$ for the estimation through $(S, C_1)$ is greater than the Fisher information $\mathcal{F}_0(\gamma_\beta)$ solely through $S$. Note that no additional resources or experiments are required to incorporate $C_1$: one simply needs to carry out additional data analysis to compute $C_1$ from the data used to evaluate $S$. In Fig. 10(b), the gain in the Fisher information by incorporating $C_1$ is shown for different temperatures.

On the other hand, incorporating more correlation data, i.e., $C_\ell$ with $\ell > 1$, does not help improve the estimation. See Fig. 10(a) again. This is because every time one performs a measurement the system is reset to a pure state by the projective measurement: there is no correlation between the measurement results separated over two steps. The system simply repeats the same dynamics, jumping between the pure states $|{\uparrow}\rangle$ and $|{\downarrow}\rangle$, and the measurement after multiple steps gains no more information than that attainable by the measurement after a single step.
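The argument above can be made concrete with a small numerical sketch. It rests on assumptions that go beyond what the text states explicitly: for $\eta = 0$ each outcome leaves the qubit in $|{\uparrow}\rangle$ ($s = +1$) or $|{\downarrow}\rangle$ ($s = -1$), so the outcomes form a two-state Markov chain whose transition probabilities follow from the $\langle\sigma_z\rangle_t$ line of (6.11). The function names are hypothetical, and the closed form used for comparison is derived from these assumptions rather than quoted from (6.26).

```python
import numpy as np


def transition_matrix(gamma: float, gamma_b: float, tau: float) -> np.ndarray:
    """T[a, b]: probability of state a after time tau given state b.

    Index 0 is 'up' (s = +1) and index 1 is 'down' (s = -1).
    """
    z_eq = -gamma / gamma_b
    decay = np.exp(-gamma_b * tau)
    columns = []
    for z0 in (+1.0, -1.0):
        z = decay * (z0 - z_eq) + z_eq       # <sigma_z> after time tau
        columns.append([(1.0 + z) / 2.0, (1.0 - z) / 2.0])
    return np.array(columns).T


def stationary_correlation(gamma: float, gamma_b: float, tau: float, ell: int) -> float:
    """<s_i s_{i+ell}> in the stationary (equilibrium) regime of the chain."""
    z_eq = -gamma / gamma_b
    populations = np.array([(1.0 + z_eq) / 2.0, (1.0 - z_eq) / 2.0])
    outcomes = np.array([+1.0, -1.0])
    t_ell = np.linalg.matrix_power(transition_matrix(gamma, gamma_b, tau), ell)
    return float(outcomes @ t_ell @ (outcomes * populations))


if __name__ == "__main__":
    for ell in (1, 2, 3):
        print(ell, stationary_correlation(1.0, 1.5, 0.4, ell))
```

Under these assumptions the lag-$\ell$ correlation equals $z_{\mathrm{eq}}^2 + (1 - z_{\mathrm{eq}}^2)e^{-\gamma_\beta\ell\tau}$ with $z_{\mathrm{eq}} = -\gamma/\gamma_\beta$, so its expectation value for $\ell > 1$ depends on $\gamma_\beta$ through the same two ingredients that already determine the $\ell = 1$ value.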


2. Weak Measurement $\eta > 0$

Let us next look at the cases with weak measurements $\eta > 0$. As is clear from Fig. 10(c), the sequential scheme can be better than the standard strategy. In particular, at low temperatures, the Fisher informations $\mathcal{F}_L(\gamma_\beta)/N$ ($L \ge 0$) by the sequential scheme exceed the Fisher information $F(\gamma_\beta)$ by the standard strategy.

FIG. 10. The Fisher informations $\mathcal{F}_L(\gamma_\beta)/N$ per measurement by the sequential scheme (solid lines) are compared with the Fisher information $F(\gamma_\beta)$ by the standard strategy with $\rho_0 = |{\downarrow}\rangle\langle{\downarrow}|$ (dashed lines) for (a) projective measurements $\eta = 0$ and for (c) weak measurements $\eta > 0$. The gains $[\mathcal{F}_L(\gamma_\beta) - \mathcal{F}_{L-1}(\gamma_\beta)]/\mathcal{F}_0(\gamma_\beta)$ by incorporating the correlation $C_L$ are shown in (b) for projective measurements $\eta = 0$ and in (d) for weak measurements $\eta > 0$.

The reason is the following. In the standard strategy, the weak measurement is performed only once, and the system is reset to the specific initial state $\rho_0$ for the next measurement. A single weak measurement acquires less information than a projective measurement, but if it is repeated many times, as in the sequential scheme, the information is accumulated, and more information is gained overall. At the same time, the system is gradually projected to one of the eigenstates of the measured observable by the repeated weak measurements [43]. In other words, the repetition of the weak measurements mimics a stronger measurement (closer to a projective measurement). That is why the sequential scheme can be better than the standard strategy in the case of weak measurement.

It is also clear from Fig. 10(c) that the precision of the estimation is improved by incorporating the correlation data $C_\ell$. The gain in the Fisher information $[\mathcal{F}_L(\gamma_\beta) - \mathcal{F}_{L-1}(\gamma_\beta)]/\mathcal{F}_0(\gamma_\beta)$ by adding a correlation $C_L$ to $(S, C_1, \ldots, C_{L-1})$ is shown in Fig. 10(d). The enhancement is noticeable when the time interval between measurements $\tau$ is short, i.e., $\gamma_\beta\tau \lesssim 1$. Moreover, the gain exhibits a peak at a smaller $\tau$ for a larger $L$. This is because the two points of each two-point correlation $C_\ell$, separated by $\ell$ steps, should be within the correlation time $\tau_c \sim 2/\gamma_\beta$ [which is ruled by the second largest eigenvalues of the mixing channel $\mathcal{E}$ in (6.19)], in order for the correlation $C_\ell$ to bear useful information.

It appears that the sequential scheme can beat the standard strategy only at low temperatures (small $\gamma_\beta$), but it should be noted that the standard strategy in Fig. 10 assumes the optimal initial state $\rho_0 = |{\downarrow}\rangle\langle{\downarrow}|$, while in the sequential scheme the system is around the stationary state $\rho_*$ of the mixing channel $\mathcal{E}$, which is the thermal equilibrium state $\rho_{\mathrm{eq}}$ [see (6.20)]. It would be more appropriate to compare the Fisher informations $\mathcal{F}_L(\gamma_\beta)/N$ by the sequential scheme with the Fisher information $F(\gamma_\beta)$ by the standard strategy in the large $\tau$ limit (which gives the Fisher information with the initial thermal state $\rho_0 = \rho_{\mathrm{eq}}$).
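The statement that the sequential scheme operates around $\rho_* = \rho_{\mathrm{eq}}$ can be illustrated by iterating one step of the scheme numerically. The sketch below is a hedged illustration: it assumes that one step consists of the free relaxation (6.11) for a time $\tau$ followed by the non-selective measurement $\sum_s M_s\,\cdot\,M_s^\dagger$ with the operators of (6.6), which is one reading of the channel defined in (3.5). All function names and the parameter values in the usage example are hypothetical.

```python
import numpy as np

IDENTITY = np.eye(2, dtype=complex)
PAULIS = (
    np.array([[0, 1], [1, 0]], dtype=complex),      # sigma_x
    np.array([[0, -1j], [1j, 0]], dtype=complex),   # sigma_y
    np.array([[1, 0], [0, -1]], dtype=complex),     # sigma_z, basis (up, down)
)


def from_bloch(x: float, y: float, z: float) -> np.ndarray:
    """Density matrix with Bloch vector (x, y, z)."""
    return 0.5 * (IDENTITY + x * PAULIS[0] + y * PAULIS[1] + z * PAULIS[2])


def relax(rho, gamma, gamma_b, omega, tau):
    """Free relaxation for a time tau, following (6.11)."""
    x, y, z = (float(np.real(np.trace(rho @ p))) for p in PAULIS)
    damp = np.exp(-gamma_b * tau / 2.0)
    c, s = np.cos(omega * tau), np.sin(omega * tau)
    z_eq = -gamma / gamma_b
    z_tau = np.exp(-gamma_b * tau) * (z - z_eq) + z_eq
    return from_bloch(damp * (x * c - y * s), damp * (x * s + y * c), z_tau)


def sequential_step(rho, gamma, gamma_b, omega, tau, eta):
    """One step: relaxation, then the measurement (6.6) summed over outcomes."""
    rho_tau = relax(rho, gamma, gamma_b, omega, tau)
    m_plus = np.diag([np.cos(eta), np.sin(eta)]).astype(complex)
    m_minus = np.diag([np.sin(eta), np.cos(eta)]).astype(complex)
    return m_plus @ rho_tau @ m_plus.conj().T + m_minus @ rho_tau @ m_minus.conj().T


def mean_outcome(rho, eta):
    """Tr{(Pi_{+1} - Pi_{-1}) rho} = cos(2 eta) <sigma_z>, from (6.8)."""
    return float(np.cos(2.0 * eta) * np.real(np.trace(rho @ PAULIS[2])))


if __name__ == "__main__":
    rho = from_bloch(1.0, 0.0, 0.0)   # start far from equilibrium
    for _ in range(200):
        rho = sequential_step(rho, 1.0, 1.5, 2.0, 0.5, 0.3)
    print(np.round(rho, 6))
```

Under this reading, the iteration converges to $\rho_{\mathrm{eq}}$ of (6.13) for any starting state, and the stationary mean outcome is $-(\gamma/\gamma_\beta)\cos 2\eta$.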

VII. CONCLUSIONS 

The estimation of a parameter encoded in a quantum 
probe, through a series of measurements performed se¬ 
quentially on the probe, has been analyzed in a gen¬ 
eral non-i.i.d. setting. On the basis of a diagrammatic 
approach we have discussed the conditions under which 
the central limit theorem holds as the number of mea¬ 


surements increases, reproducing the previous results [19] 
and generalizing them to the case where the correlations 
among the measurement data are also taken into account 
in the estimation strategy. Our analysis explicitly shows 
that the latter strategy can yield a significant advantage 
over the standard procedure where only the average of 
the acquired data is considered. 

At present however it is not clear whether this is the 
best strategy one can do: it is indeed possible that differ¬ 
ent data processing (including the evaluation of higher- 
order correlations commented at the end of Sec. V) can 
improve further the attainable accuracy. In the exam¬ 
ple studied in Sec. VI, the sequential scheme surpassed 
the standard i.i.d. procedure when we are able to per¬ 
form only weak measurements, but could not beat the 
standard procedure when we are allowed to perform 
strong measurements. A better strategy for the sequen¬ 
tial scheme could beat the ultimate precision achievable 
by the standard strategy. The optimal strategy would 
require different measurements step by step, or moreover 
would require quantum-correlated measurements over 
different measurement probings. The use of entangle¬ 
ment is also an interesting possibility [44]. It is yet to be 
clarified what is the ultimate accuracy attainable in the 
sequential scheme for parameter estimation [45]. 

Recently, quantum metrology in the presence of noise has been under intense study [44, 46-48]. The mixing property required for the sequential scheme is relevant to noisy channels, and it would be interesting to explore the connections with this issue.
Appendix A: Complete Expressions for the Covariances

In Sec. V, we derived the asymptotic expression (5.8) for the even moments among $S$ and $C_\ell$. Here we provide the complete expressions for the covariances among $S$ and $C_\ell$ valid for any (even small) $N$. Under the assumption that the quantum channel $\mathcal{E}$ is ergodic (not necessarily mixing), they read

where $\mathcal{E}^{(n)}$ and $(\Delta s)_*^2$ are defined in (4.7) and (4.8), respectively, the operators entering the correlation terms are defined in (5.7) and (5.12), and the other components are given in (5.14)-(5.19).


Appendix B: Covariances among $C_\ell$ for the Model


In (6.27) in Sec. VI we showed the asymptotic expression for the covariance between $C_\ell$ and $C_{\ell'}$ for large $N$ for the model. Here we provide its complete expression valid for any (even small) $N$. In the stationary state $\rho_0 = \rho_*$, the covariances between $C_\ell$ and $C_{\ell'}$ ($\ell \ge \ell' \ge 1$) are given for $N > \ell + \ell'$ by

while for $\ell + \ell' \ge N \ge \ell + 1$ by